[docs] Pipeline loading (#7684)

* pipelines
* schedulers and models
* community pipelines
* feedback
2026-01-27 17:22:53 +03:00 · 2024-04-17 15:42:27 -07:00
parent 9132ce7c58
commit 7635d3d37f
5 changed files with 383 additions and 591 deletions
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -24,14 +24,12 @@
  title: Tutorials
 - sections:
  - sections:
-    - local: using-diffusers/loading_overview
-      title: Overview
    - local: using-diffusers/loading
-      title: Load pipelines, models, and schedulers
-    - local: using-diffusers/schedulers
-      title: Load and compare different schedulers
+      title: Load pipelines
    - local: using-diffusers/custom_pipeline_overview
      title: Load community pipelines and components
+    - local: using-diffusers/schedulers
+      title: Load schedulers and models
    - local: using-diffusers/using_safetensors
      title: Load safetensors
    - local: using-diffusers/other-formats
--- a/docs/source/en/using-diffusers/custom_pipeline_overview.md
+++ b/docs/source/en/using-diffusers/custom_pipeline_overview.md
@@ -16,17 +16,19 @@ specific language governing permissions and limitations under the License.

 ## Community pipelines

-Community pipelines are any [`DiffusionPipeline`] class that are different from the original implementation as specified in their paper (for example, the [`StableDiffusionControlNetPipeline`] corresponds to the [Text-to-Image Generation with ControlNet Conditioning](https://arxiv.org/abs/2302.05543) paper). They provide additional functionality or extend the original implementation of a pipeline.
+A community pipeline is any [`DiffusionPipeline`] class that differs from the original paper implementation (for example, the [`StableDiffusionControlNetPipeline`] corresponds to the [Text-to-Image Generation with ControlNet Conditioning](https://arxiv.org/abs/2302.05543) paper). Community pipelines provide additional functionality or extend the original implementation of a pipeline.

-There are many cool community pipelines like [Speech to Image](https://github.com/huggingface/diffusers/tree/main/examples/community#speech-to-image) or [Composable Stable Diffusion](https://github.com/huggingface/diffusers/tree/main/examples/community#composable-stable-diffusion), and you can find all the official community pipelines [here](https://github.com/huggingface/diffusers/tree/main/examples/community).
+There are many cool community pipelines like [Marigold Depth Estimation](https://github.com/huggingface/diffusers/tree/main/examples/community#marigold-depth-estimation) or [InstantID](https://github.com/huggingface/diffusers/tree/main/examples/community#instantid-pipeline), and you can find all the official community pipelines [here](https://github.com/huggingface/diffusers/tree/main/examples/community).

-To load any community pipeline on the Hub, pass the repository id of the community pipeline to the `custom_pipeline` argument and the model repository where you'd like to load the pipeline weights and components from. For example, the example below loads a dummy pipeline from [`hf-internal-testing/diffusers-dummy-pipeline`](https://huggingface.co/hf-internal-testing/diffusers-dummy-pipeline/blob/main/pipeline.py) and the pipeline weights and components from [`google/ddpm-cifar10-32`](https://huggingface.co/google/ddpm-cifar10-32):
+There are two types of community pipelines: those stored on the Hugging Face Hub and those stored in the Diffusers GitHub repository. Hub pipelines are completely customizable (scheduler, models, pipeline code, etc.) while Diffusers GitHub pipelines are limited to custom pipeline code. Refer to this [table](./contribute_pipeline#share-your-pipeline) for a more detailed comparison of Hub vs GitHub community pipelines.

-<Tip warning={true}>
+<hfoptions id="community">
+<hfoption id="Hub pipelines">

-🔒 By loading a community pipeline from the Hugging Face Hub, you are trusting that the code you are loading is safe. Make sure to inspect the code online before loading and running it automatically!
+To load a Hugging Face Hub community pipeline, pass the repository id of the community pipeline to the `custom_pipeline` argument, along with the model repository you'd like to load the pipeline weights and components from. The example below loads a dummy pipeline from [hf-internal-testing/diffusers-dummy-pipeline](https://huggingface.co/hf-internal-testing/diffusers-dummy-pipeline/blob/main/pipeline.py) and the pipeline weights and components from [google/ddpm-cifar10-32](https://huggingface.co/google/ddpm-cifar10-32):

-</Tip>
+> [!WARNING]
+> By loading a community pipeline from the Hugging Face Hub, you are trusting that the code you are loading is safe. Make sure to inspect the code online before loading and running it automatically!

 ```py
 from diffusers import DiffusionPipeline
@@ -36,7 +38,10 @@ pipeline = DiffusionPipeline.from_pretrained(
 )
 ```

-Loading an official community pipeline is similar, but you can mix loading weights from an official repository id and pass pipeline components directly. The example below loads the community [CLIP Guided Stable Diffusion](https://github.com/huggingface/diffusers/tree/main/examples/community#clip-guided-stable-diffusion) pipeline, and you can pass the CLIP model components directly to it:
+</hfoption>
+<hfoption id="GitHub pipelines">
+
+To load a GitHub community pipeline, pass the repository id of the community pipeline to the `custom_pipeline` argument, along with the model repository you'd like to load the pipeline weights and components from. You can also load model components directly. The example below loads the community [CLIP Guided Stable Diffusion](https://github.com/huggingface/diffusers/tree/main/examples/community#clip-guided-stable-diffusion) pipeline and the CLIP model components.

 ```py
 from diffusers import DiffusionPipeline
@@ -56,9 +61,12 @@ pipeline = DiffusionPipeline.from_pretrained(
 )
 ```

+</hfoption>
+</hfoptions>
+
 ### Load from a local file

-Community pipelines can also be loaded from a local file if you pass a file path instead. The path to the passed directory must contain a `pipeline.py` file that contains the pipeline class in order to successfully load it.
+Community pipelines can also be loaded from a local file if you pass the path to a local directory instead. The directory must contain a pipeline.py file that defines the pipeline class.

 ```py
 pipeline = DiffusionPipeline.from_pretrained(
@@ -77,7 +85,7 @@ By default, community pipelines are loaded from the latest stable version of Dif
 <hfoptions id="version">
 <hfoption id="main">

-For example, to load from the `main` branch:
+For example, to load from the main branch:

 ```py
 pipeline = DiffusionPipeline.from_pretrained(
@@ -93,7 +101,7 @@ pipeline = DiffusionPipeline.from_pretrained(
 </hfoption>
 <hfoption id="older version">

-For example, to load from a previous version of Diffusers like `v0.25.0`:
+For example, to load from a previous version of Diffusers like v0.25.0:

 ```py
 pipeline = DiffusionPipeline.from_pretrained(
@@ -109,8 +117,49 @@ pipeline = DiffusionPipeline.from_pretrained(
 </hfoption>
 </hfoptions>

+### Load with from_pipe

-For more information about community pipelines, take a look at the [Community pipelines](custom_pipeline_examples) guide for how to use them and if you're interested in adding a community pipeline check out the [How to contribute a community pipeline](contribute_pipeline) guide!
+Community pipelines can also be loaded with the [`~DiffusionPipeline.from_pipe`] method which allows you to load and reuse multiple pipelines without any additional memory overhead (learn more in the [Reuse a pipeline](./loading#reuse-a-pipeline) guide). The memory requirement is determined by the largest single pipeline loaded.
+
+For example, let's load a community pipeline that supports [long prompts with weighting](https://github.com/huggingface/diffusers/tree/main/examples/community#long-prompt-weighting-stable-diffusion) from a Stable Diffusion pipeline.
+
+```py
+import torch
+from diffusers import DiffusionPipeline
+
+pipe_sd = DiffusionPipeline.from_pretrained("emilianJR/CyberRealistic_V3", torch_dtype=torch.float16)
+pipe_sd.to("cuda")
+# load long prompt weighting pipeline
+pipe_lpw = DiffusionPipeline.from_pipe(
+    pipe_sd,
+    custom_pipeline="lpw_stable_diffusion",
+).to("cuda")
+
+prompt = "cat, hiding in the leaves, ((rain)), zazie rainyday, beautiful eyes, macro shot, colorful details, natural lighting, amazing composition, subsurface scattering, amazing textures, filmic, soft light, ultra-detailed eyes, intricate details, detailed texture, light source contrast, dramatic shadows, cinematic light, depth of field, film grain, noise, dark background, hyperrealistic dslr film still, dim volumetric cinematic lighting"
+neg_prompt = "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime, mutated hands and fingers:1.4), (deformed, distorted, disfigured:1.3), poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, disconnected limbs, mutation, mutated, ugly, disgusting, amputation"
+generator = torch.Generator(device="cpu").manual_seed(20)
+out_lpw = pipe_lpw(
+    prompt, 
+    negative_prompt=neg_prompt, 
+    width=512,
+    height=512,
+    max_embeddings_multiples=3, 
+    num_inference_steps=50,
+    generator=generator,
+    ).images[0]
+out_lpw
+```
+
+<div class="flex gap-4">
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/from_pipe_lpw.png" />
+    <figcaption class="mt-2 text-center text-sm text-gray-500">Stable Diffusion with long prompt weighting</figcaption>
+  </div>
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/from_pipe_non_lpw.png" />
+    <figcaption class="mt-2 text-center text-sm text-gray-500">Stable Diffusion</figcaption>
+  </div>
+</div>

 ## Community components

@@ -118,7 +167,7 @@ Community components allow users to build pipelines that may have customized com

 This section shows how users should use community components to build a community pipeline.

-You'll use the [showlab/show-1-base](https://huggingface.co/showlab/show-1-base) pipeline checkpoint as an example. So, let's start loading the components:
+You'll use the [showlab/show-1-base](https://huggingface.co/showlab/show-1-base) pipeline checkpoint as an example.

 1. Import and load the text encoder from Transformers:

@@ -152,17 +201,17 @@ In steps 4 and 5, the custom [UNet](https://github.com/showlab/Show-1/blob/main/

 </Tip>

-4. Now you'll load a [custom UNet](https://github.com/showlab/Show-1/blob/main/showone/models/unet_3d_condition.py), which in this example, has already been implemented in the `showone_unet_3d_condition.py` [script](https://huggingface.co/sayakpaul/show-1-base-with-code/blob/main/unet/showone_unet_3d_condition.py) for your convenience. You'll notice the `UNet3DConditionModel` class name is changed to `ShowOneUNet3DConditionModel` because [`UNet3DConditionModel`] already exists in Diffusers. Any components needed for the `ShowOneUNet3DConditionModel` class should be placed in the `showone_unet_3d_condition.py` script.
+4. Now you'll load a [custom UNet](https://github.com/showlab/Show-1/blob/main/showone/models/unet_3d_condition.py), which in this example, has already been implemented in [showone_unet_3d_condition.py](https://huggingface.co/sayakpaul/show-1-base-with-code/blob/main/unet/showone_unet_3d_condition.py) for your convenience. You'll notice the [`UNet3DConditionModel`] class name is changed to `ShowOneUNet3DConditionModel` because [`UNet3DConditionModel`] already exists in Diffusers. Any components needed for the `ShowOneUNet3DConditionModel` class should be placed in showone_unet_3d_condition.py.

-Once this is done, you can initialize the UNet:
+    Once this is done, you can initialize the UNet:

-```python
-from showone_unet_3d_condition import ShowOneUNet3DConditionModel
+    ```python
+    from showone_unet_3d_condition import ShowOneUNet3DConditionModel

-unet = ShowOneUNet3DConditionModel.from_pretrained(pipe_id, subfolder="unet")
-```
+    unet = ShowOneUNet3DConditionModel.from_pretrained(pipe_id, subfolder="unet")
+    ```

-5. Finally, you'll load the custom pipeline code. For this example, it has already been created for you in the `pipeline_t2v_base_pixel.py` [script](https://huggingface.co/sayakpaul/show-1-base-with-code/blob/main/pipeline_t2v_base_pixel.py). This script contains a custom `TextToVideoIFPipeline` class for generating videos from text. Just like the custom UNet, any code needed for the custom pipeline to work should go in the `pipeline_t2v_base_pixel.py` script. 
+5. Finally, you'll load the custom pipeline code. For this example, it has already been created for you in [pipeline_t2v_base_pixel.py](https://huggingface.co/sayakpaul/show-1-base-with-code/blob/main/pipeline_t2v_base_pixel.py). This script contains a custom `TextToVideoIFPipeline` class for generating videos from text. Just like the custom UNet, any code needed for the custom pipeline to work should go in pipeline_t2v_base_pixel.py.

 Once everything is in place, you can initialize the `TextToVideoIFPipeline` with the `ShowOneUNet3DConditionModel`:

@@ -187,13 +236,16 @@ Push the pipeline to the Hub to share with the community!
 pipeline.push_to_hub("custom-t2v-pipeline")
 ```

-After the pipeline is successfully pushed, you need a couple of changes:
+After the pipeline is successfully pushed, you need to make a few changes:

-1. Change the `_class_name` attribute in [`model_index.json`](https://huggingface.co/sayakpaul/show-1-base-with-code/blob/main/model_index.json#L2) to `"pipeline_t2v_base_pixel"` and `"TextToVideoIFPipeline"`.
-2. Upload `showone_unet_3d_condition.py` to the `unet` [directory](https://huggingface.co/sayakpaul/show-1-base-with-code/blob/main/unet/showone_unet_3d_condition.py).
-3. Upload `pipeline_t2v_base_pixel.py` to the pipeline base [directory](https://huggingface.co/sayakpaul/show-1-base-with-code/blob/main/unet/showone_unet_3d_condition.py).
+1. Change the `_class_name` attribute in [model_index.json](https://huggingface.co/sayakpaul/show-1-base-with-code/blob/main/model_index.json#L2) to `"pipeline_t2v_base_pixel"` and `"TextToVideoIFPipeline"`.
+2. Upload `showone_unet_3d_condition.py` to the [unet](https://huggingface.co/sayakpaul/show-1-base-with-code/blob/main/unet/showone_unet_3d_condition.py) subfolder.
+3. Upload `pipeline_t2v_base_pixel.py` to the pipeline [repository](https://huggingface.co/sayakpaul/show-1-base-with-code/tree/main).
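
The first change in the list above can also be scripted. The following is a minimal sketch, assuming `model_index.json` has been downloaded to a local path and that `_class_name` should become a two-item list of module name and class name, as in the linked example repository. The helper name `set_custom_pipeline_class` is hypothetical and not part of Diffusers; it only rewrites the JSON file.

```python
import json
from pathlib import Path


def set_custom_pipeline_class(index_path, module_name, class_name):
    """Point `_class_name` in a local model_index.json at custom pipeline code.

    Hypothetical helper, not a Diffusers API: it only edits the JSON file.
    """
    path = Path(index_path)
    config = json.loads(path.read_text())
    # Assumed layout: [module file name without ".py", pipeline class name]
    config["_class_name"] = [module_name, class_name]
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For the example in this guide, the call would be `set_custom_pipeline_class("model_index.json", "pipeline_t2v_base_pixel", "TextToVideoIFPipeline")`, after which the edited file still has to be uploaded to the repository.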

-To run inference, simply add the `trust_remote_code` argument while initializing the pipeline to handle all the "magic" behind the scenes.
+To run inference, add the `trust_remote_code` argument while initializing the pipeline to handle all the "magic" behind the scenes.
+
+> [!WARNING]
+> As an additional precaution with `trust_remote_code=True`, we strongly encourage you to pass a commit hash to the `revision` parameter in [`~DiffusionPipeline.from_pretrained`] to make sure the code hasn't been updated with some malicious new lines of code (unless you fully trust the model owners).

 ```python
 from diffusers import DiffusionPipeline
@@ -221,10 +273,9 @@ video_frames = pipeline(
 ).frames
 ```
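
The warning about pinning a `revision` can be enforced with a small pre-flight check. This is a sketch under the assumption that a pinned revision is a full 40-character hexadecimal commit hash; `is_pinned_revision` is a hypothetical helper, not a Diffusers function.

```python
import re

# A full Git commit hash is 40 lowercase hexadecimal characters.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_pinned_revision(revision):
    """Return True only if `revision` looks like a full commit hash.

    Hypothetical helper: branch names and tags (for example "main") can
    move to new commits, so they are rejected here.
    """
    return isinstance(revision, str) and _FULL_SHA.fullmatch(revision) is not None
```

You could call it before `DiffusionPipeline.from_pretrained(..., trust_remote_code=True, revision=revision)` and refuse to load remote code when it returns `False`.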

-As an additional reference example, you can refer to the repository structure of [stabilityai/japanese-stable-diffusion-xl](https://huggingface.co/stabilityai/japanese-stable-diffusion-xl/), that makes use of the `trust_remote_code` feature:
+As an additional reference, take a look at the repository structure of [stabilityai/japanese-stable-diffusion-xl](https://huggingface.co/stabilityai/japanese-stable-diffusion-xl/) which also uses the `trust_remote_code` feature.

 ```python
-
 from diffusers import DiffusionPipeline
 import torch

@@ -232,14 +283,4 @@ pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/japanese-stable-diffusion-xl", trust_remote_code=True
 )
 pipeline.to("cuda")
-
-# if using torch < 2.0
-# pipeline.enable_xformers_memory_efficient_attention()
-
-prompt = "柴犬、カラフルアート"
-
-image = pipeline(prompt=prompt).images[0]
 ```
-
-> [!TIP]
-> When using `trust_remote_code=True`, it is also strongly encouraged to pass a commit hash as a `revision` to make sure the author of the models did not update the code with some malicious new lines (unless you fully trust the authors of the models).
--- a/docs/source/en/using-diffusers/loading.md
+++ b/docs/source/en/using-diffusers/loading.md
@@ -10,57 +10,75 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->

-# Load pipelines, models, and schedulers
+# Load pipelines

 [[open-in-colab]]

-Having an easy way to use a diffusion system for inference is essential to 🧨 Diffusers. Diffusion systems often consist of multiple components like parameterized models, tokenizers, and schedulers that interact in complex ways. That is why we designed the [`DiffusionPipeline`] to wrap the complexity of the entire diffusion system into an easy-to-use API, while remaining flexible enough to be adapted for other use cases, such as loading each component individually as building blocks to assemble your own diffusion system.
-
-Everything you need for inference or training is accessible with the `from_pretrained()` method.
+Diffusion systems consist of multiple components like parameterized models and schedulers that interact in complex ways. That is why we designed the [`DiffusionPipeline`] to wrap the complexity of the entire diffusion system into an easy-to-use API. At the same time, the [`DiffusionPipeline`] is entirely customizable so you can modify each component to build a diffusion system for your use case.

 This guide will show you how to load:

 - pipelines from the Hub and locally
 - different components into a pipeline
+- multiple pipelines without increasing memory usage
 - checkpoint variants such as different floating point types or non-exponential mean averaged (EMA) weights
- models and schedulers

-## Diffusion Pipeline
+## Load a pipeline

-<Tip>
+> [!TIP]
+> Skip to the [DiffusionPipeline explained](#diffusionpipeline-explained) section if you're interested in an explanation about how the [`DiffusionPipeline`] class works.

-💡 Skip to the [DiffusionPipeline explained](#diffusionpipeline-explained) section if you are interested in learning in more detail about how the [`DiffusionPipeline`] class works.
+There are two ways to load a pipeline for a task:

-</Tip>
+1. Load the generic [`DiffusionPipeline`] class and allow it to automatically detect the correct pipeline class from the checkpoint.
+2. Load a specific pipeline class for a specific task.

-The [`DiffusionPipeline`] class is the simplest and most generic way to load the latest trending diffusion model from the [Hub](https://huggingface.co/models?library=diffusers&sort=trending). The [`DiffusionPipeline.from_pretrained`] method automatically detects the correct pipeline class from the checkpoint, downloads, and caches all the required configuration and weight files, and returns a pipeline instance ready for inference.
+<hfoptions id="pipelines">
+<hfoption id="generic pipeline">
+
+The [`DiffusionPipeline`] class is a simple and generic way to load the latest trending diffusion model from the [Hub](https://huggingface.co/models?library=diffusers&sort=trending). The [`~DiffusionPipeline.from_pretrained`] method automatically detects the correct pipeline class for a task from the checkpoint, downloads and caches all the required configuration and weight files, and returns a pipeline ready for inference.

 ```python
 from diffusers import DiffusionPipeline

-repo_id = "runwayml/stable-diffusion-v1-5"
-pipe = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
+pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_safetensors=True)
 ```

-You can also load a checkpoint with its specific pipeline class. The example above loaded a Stable Diffusion model; to get the same result, use the [`StableDiffusionPipeline`] class:
+This same checkpoint can also be used for an image-to-image task. The [`DiffusionPipeline`] class can handle any task as long as you provide the appropriate inputs. For example, for an image-to-image task, you need to pass an initial image to the pipeline.
+
+```py
+from diffusers import DiffusionPipeline
+from diffusers.utils import load_image
+
+pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_safetensors=True)
+
+init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/img2img-init.png")
+prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
+image = pipeline(prompt, image=init_image).images[0]
+```
+
+</hfoption>
+<hfoption id="specific pipeline">
+
+Checkpoints can be loaded by their specific pipeline class if you already know it. For example, to load a Stable Diffusion model, use the [`StableDiffusionPipeline`] class.

 ```python
 from diffusers import StableDiffusionPipeline

-repo_id = "runwayml/stable-diffusion-v1-5"
-pipe = StableDiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
+pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_safetensors=True)
 ```

-A checkpoint (such as [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) or [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)) may also be used for more than one task, like text-to-image or image-to-image. To differentiate what task you want to use the checkpoint for, you have to load it directly with its corresponding task-specific pipeline class:
+This same checkpoint may also be used for another task like image-to-image. To differentiate what task you want to use the checkpoint for, you have to use the corresponding task-specific pipeline class. For example, to use the same checkpoint for image-to-image, use the [`StableDiffusionImg2ImgPipeline`] class.

-```python
+```py
 from diffusers import StableDiffusionImg2ImgPipeline

-repo_id = "runwayml/stable-diffusion-v1-5"
-pipe = StableDiffusionImg2ImgPipeline.from_pretrained(repo_id)
+pipeline = StableDiffusionImg2ImgPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_safetensors=True)
 ```

-You can use the Space below to gauge the memory requirements of a pipeline you want to load beforehand without downloading the pipeline checkpoints:
+</hfoption>
+</hfoptions>
+
+Use the Space below to gauge a pipeline's memory requirements before you download and load it to see if it runs on your hardware.

 <div class="block dark:hidden">
 	<iframe 
@@ -79,113 +97,66 @@ You can use the Space below to gauge the memory requirements of a pipeline you w

 ### Local pipeline

-To load a diffusion pipeline locally, use [`git-lfs`](https://git-lfs.github.com/) to manually download the checkpoint (in this case, [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)) to your local disk. This creates a local folder, `./stable-diffusion-v1-5`, on your disk:
+To load a pipeline locally, use [git-lfs](https://git-lfs.github.com/) to manually download a checkpoint to your local disk.

 ```bash
 git-lfs install
 git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
 ```

-Then pass the local path to [`~DiffusionPipeline.from_pretrained`]:
+This creates a local folder, ./stable-diffusion-v1-5, on your disk. Pass its path to [`~DiffusionPipeline.from_pretrained`].

 ```python
 from diffusers import DiffusionPipeline

-repo_id = "./stable-diffusion-v1-5"
-stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
+stable_diffusion = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-5", use_safetensors=True)
 ```

-The [`~DiffusionPipeline.from_pretrained`] method won't download any files from the Hub when it detects a local path, but this also means it won't download and cache the latest changes to a checkpoint.
+The [`~DiffusionPipeline.from_pretrained`] method won't download files from the Hub when it detects a local path, but this also means it won't download and cache the latest changes to a checkpoint.
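
Because a local path is never refreshed from the Hub, it can help to confirm that the folder looks like a cloned pipeline before loading it. The sketch below assumes a pipeline checkpoint keeps a `model_index.json` file at its root, as the Hub repositories linked in this guide do. `is_local_pipeline_dir` is a hypothetical helper, not part of Diffusers.

```python
from pathlib import Path


def is_local_pipeline_dir(path):
    """Return True if `path` is a directory containing model_index.json.

    Hypothetical pre-flight check, not a Diffusers API.
    """
    folder = Path(path)
    return folder.is_dir() and (folder / "model_index.json").is_file()
```

For the clone above, you might check `is_local_pipeline_dir("./stable-diffusion-v1-5")` before passing the same path to `from_pretrained`.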

-### Swap components in a pipeline
+## Customize a pipeline

-You can customize the default components of any pipeline with another compatible component. Customization is important because:
+You can customize a pipeline by loading different components into it. This is important because you can:

- Changing the scheduler is important for exploring the trade-off between generation speed and quality.
- Different components of a model are typically trained independently and you can swap out a component with a better-performing one.
- During finetuning, usually only some components - like the UNet or text encoder - are trained.
+- change to a scheduler with faster generation speed or higher generation quality depending on your needs (check the `scheduler.compatibles` attribute of your pipeline to see compatible schedulers)
+- change a default pipeline component to a newer and better performing one

-To find out which schedulers are compatible for customization, you can use the `compatibles` method:
+For example, let's customize the default [stabilityai/stable-diffusion-xl-base-1.0](https://hf.co/stabilityai/stable-diffusion-xl-base-1.0) checkpoint with:
+
+- The [`HeunDiscreteScheduler`] to generate higher quality images at the expense of slower generation speed. You must pass the `subfolder="scheduler"` parameter in [`~HeunDiscreteScheduler.from_pretrained`] to load the scheduler configuration from the correct [subfolder](https://hf.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main/scheduler) of the pipeline repository.
+- A more stable VAE that runs in fp16.

 ```py
-from diffusers import DiffusionPipeline
+from diffusers import StableDiffusionXLPipeline, HeunDiscreteScheduler, AutoencoderKL
+import torch

-repo_id = "runwayml/stable-diffusion-v1-5"
-stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
-stable_diffusion.scheduler.compatibles
+scheduler = HeunDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")
+vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16, use_safetensors=True)
 ```

-Let's use the [`SchedulerMixin.from_pretrained`] method to replace the default [`PNDMScheduler`] with a more performant scheduler, [`EulerDiscreteScheduler`]. The `subfolder="scheduler"` argument is required to load the scheduler configuration from the correct [subfolder](https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main/scheduler) of the pipeline repository.
-
-Then you can pass the new [`EulerDiscreteScheduler`] instance to the `scheduler` argument in [`DiffusionPipeline`]:
-
-```python
-from diffusers import DiffusionPipeline, EulerDiscreteScheduler
-
-repo_id = "runwayml/stable-diffusion-v1-5"
-scheduler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
-stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler, use_safetensors=True)
-```
-
-### Safety checker
-
-Diffusion models like Stable Diffusion can generate harmful content, which is why 🧨 Diffusers has a [safety checker](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/safety_checker.py) to check generated outputs against known hardcoded NSFW content. If you'd like to disable the safety checker for whatever reason, pass `None` to the `safety_checker` argument:
-
-```python
-from diffusers import DiffusionPipeline
-
-repo_id = "runwayml/stable-diffusion-v1-5"
-stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, safety_checker=None, use_safetensors=True)
-"""
-You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide by the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend keeping the safety filter enabled in all public-facing circumstances, disabling it only for use cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
-"""
-```
-
-### Reuse components across pipelines
-
-You can also reuse the same components in multiple pipelines to avoid loading the weights into RAM twice. Use the [`~DiffusionPipeline.components`] method to save the components:
-
-```python
-from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline
-
-model_id = "runwayml/stable-diffusion-v1-5"
-stable_diffusion_txt2img = StableDiffusionPipeline.from_pretrained(model_id, use_safetensors=True)
-
-components = stable_diffusion_txt2img.components
-```
-
-Then you can pass the `components` to another pipeline without reloading the weights into RAM:
+Now pass the new scheduler and VAE to the [`StableDiffusionXLPipeline`].

 ```py
-stable_diffusion_img2img = StableDiffusionImg2ImgPipeline(**components)
+pipeline = StableDiffusionXLPipeline.from_pretrained(
+  "stabilityai/stable-diffusion-xl-base-1.0", 
+  scheduler=scheduler, 
+  vae=vae, 
+  torch_dtype=torch.float16, 
+  variant="fp16", 
+  use_safetensors=True
+).to("cuda")
 ```

-You can also pass the components individually to the pipeline if you want more flexibility over which components to reuse or disable. For example, to reuse the same components in the text-to-image pipeline, except for the safety checker and feature extractor, in the image-to-image pipeline:
+## Reuse a pipeline

-```py
-from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline
+When you load multiple pipelines that share the same model components, it makes sense to reuse the shared components instead of reloading everything into memory again, especially if your hardware is memory-constrained. For example:

-model_id = "runwayml/stable-diffusion-v1-5"
-stable_diffusion_txt2img = StableDiffusionPipeline.from_pretrained(model_id, use_safetensors=True)
-stable_diffusion_img2img = StableDiffusionImg2ImgPipeline(
-    vae=stable_diffusion_txt2img.vae,
-    text_encoder=stable_diffusion_txt2img.text_encoder,
-    tokenizer=stable_diffusion_txt2img.tokenizer,
-    unet=stable_diffusion_txt2img.unet,
-    scheduler=stable_diffusion_txt2img.scheduler,
-    safety_checker=None,
-    feature_extractor=None,
-    requires_safety_checker=False,
-)
-```
+1. You generated an image with the [`StableDiffusionPipeline`] but you want to improve its quality with the [`StableDiffusionSAGPipeline`]. Both of these pipelines share the same pretrained model, so it'd be a waste of memory to load the same model twice.
+2. You want to add a model component, like a [`MotionAdapter`](../api/pipelines/animatediff#animatediffpipeline), to [`AnimateDiffPipeline`] which was instantiated from an existing [`StableDiffusionPipeline`]. Again, both pipelines share the same pretrained model, so it'd be a waste of memory to load an entirely new pipeline again.

-### Switch loaded pippelines
+With the [`DiffusionPipeline.from_pipe`] API, you can switch between multiple pipelines to take advantage of their different features without increasing memory usage. It is similar to turning a feature on and off in your pipeline. To switch between tasks, use the [`~DiffusionPipeline.from_pipe`] method with the [`AutoPipeline`](../api/pipelines/auto_pipeline) class, which automatically identifies the pipeline class based on the task (learn more in the [AutoPipeline](../tutorials/autopipeline) tutorial).
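
To make the memory claim concrete, here is a toy sketch in plain Python. `ToyPipeline` and `ToySAGPipeline` are hypothetical stand-ins, not Diffusers classes, and this is not how `from_pipe` is implemented. It only illustrates the idea that a second pipeline built from the first holds references to the same component objects instead of copies.

```python
class ToyPipeline:
    """Stand-in for a pipeline; it only stores references to its components."""

    def __init__(self, **components):
        self.components = dict(components)

    @classmethod
    def from_pipe(cls, pipeline, **overrides):
        # Reuse the existing component objects; only the overrides are new.
        return cls(**{**pipeline.components, **overrides})


class ToySAGPipeline(ToyPipeline):
    """A second pipeline class that shares the same components."""


unet = bytearray(1024)  # placeholder for large model weights
pipe_sd = ToyPipeline(unet=unet, scheduler="pndm")
pipe_sag = ToySAGPipeline.from_pipe(pipe_sd, scheduler="ddim")

# Both pipelines point at the very same object, so the weights were not copied.
shared = pipe_sag.components["unet"] is pipe_sd.components["unet"]
```

In this sketch, swapping the scheduler on the second pipeline leaves the first one untouched, while the large component is held only once in memory.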

-There are many diffuser pipelines that use the same pre-trained model as [`StableDiffusionPipeline`] and [`StableDiffusionXLPipeline`], but they implement specific features to help you achieve better generation results. This guide will show you how to use the `from_pipe` API to create multiple pipelines without increasing memory usage. By using this approach, you can easily switch between pipelines to use different features.
-
-Let's take an example where we first create a [`StableDiffusionPipeline`] and then reuse the already loaded model components to create a [`StableDiffusionSAGPipeline`] to enhance generation quality.
-
-we will generate an image of a bear eating pizza using Stable Diffusion with the IP-Adapter
+Let's start with a [`StableDiffusionPipeline`] and then reuse the loaded model components to create a [`StableDiffusionSAGPipeline`] to increase generation quality. You'll use the [`StableDiffusionPipeline`] with an [IP-Adapter](./ip_adapter) to generate a bear eating pizza.

 ```python
 from diffusers import DiffusionPipeline, StableDiffusionSAGPipeline
@@ -194,123 +165,85 @@ import gc
 from diffusers.utils import load_image
 from accelerate.utils import compute_module_sizes

-base_repo = "SG161222/Realistic_Vision_V6.0_B1_noVAE"
-num_inference_steps = 50
 image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png")
-prompt="bear eats pizza"
-negative_prompt = "wrong white balance, dark, sketches,worst quality,low quality"

-pipe_sd = DiffusionPipeline.from_pretrained(base_repo, torch_dtype=torch.float16)
+pipe_sd = DiffusionPipeline.from_pretrained("SG161222/Realistic_Vision_V6.0_B1_noVAE", torch_dtype=torch.float16)
 pipe_sd.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
 pipe_sd.set_ip_adapter_scale(0.6)
 pipe_sd.to("cuda")

 generator = torch.Generator(device="cpu").manual_seed(33)
 out_sd = pipe_sd(
-    prompt=prompt,
-    negative_prompt=negative_prompt, 
+    prompt="bear eats pizza",
+    negative_prompt="wrong white balance, dark, sketches,worst quality,low quality", 
    ip_adapter_image=image,
-    num_inference_steps=num_inference_steps,
+    num_inference_steps=50,
    generator=generator,
 ).images[0]
+out_sd
 ```

-let’s take a look at the image and also print out the memory used 
-
 <div class="flex justify-center">
  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_sd_0.png"/>
 </div>

+For reference, you can check how much memory this process consumed.
+
 ```python
 def bytes_to_giga_bytes(bytes):
    return bytes / 1024 / 1024 / 1024
-print(
-    f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB"
-)
+print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
+"Max memory allocated: 4.406213283538818 GB"
 ```
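Note that the helper above divides by 1024 three times, so the number it prints is in gibibytes (GiB) even though the label says GB. Below is a minimal, torch-free sketch of the same conversion. The name `bytes_to_gib` and the rounding argument are assumptions made for illustration and are not part of Diffusers.

```python
def bytes_to_gib(num_bytes: int, ndigits: int = 3) -> float:
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes), rounded for display."""
    return round(num_bytes / 1024**3, ndigits)


# Illustrative input: 4 GiB expressed in bytes
print(f"{bytes_to_gib(4 * 1024**3)} GiB")
```

With a GPU available, you could pass `torch.cuda.max_memory_allocated()` to this helper instead of the hard-coded value.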

-```bash
-Max memory allocated: 4.406213283538818 GB
-```
+Now, reuse the same pipeline components from [`StableDiffusionPipeline`] in [`StableDiffusionSAGPipeline`] with the [`~DiffusionPipeline.from_pipe`] method.

-Now, we can use `from_pipe` to switch to the SAG pipeline. 
+> [!WARNING]
+> Some pipeline methods may not function properly on new pipelines created with [`~DiffusionPipeline.from_pipe`]. For instance, the [`~DiffusionPipeline.enable_model_cpu_offload`] method installs hooks on the model components based on a unique offloading sequence for each pipeline. If the models are executed in a different order in the new pipeline, the CPU offloading may not work correctly.
+>
+> To ensure everything works as expected, we recommend re-applying a pipeline method on a new pipeline created with [`~DiffusionPipeline.from_pipe`].

 ```python
 pipe_sag = StableDiffusionSAGPipeline.from_pipe(
-    pipe_sd,
+    pipe_sd
 )
-```

-It already has IP-Adapter loaded so that you can pass the same bear image as `ip_adapter_image`
-
-```python
 generator = torch.Generator(device="cpu").manual_seed(33)
 out_sag = pipe_sag(
-    prompt = prompt, 
-    negative_prompt=negative_prompt, 
+    prompt="bear eats pizza",
+    negative_prompt="wrong white balance, dark, sketches,worst quality,low quality",
    ip_adapter_image=image,
-    num_inference_steps=num_inference_steps,
+    num_inference_steps=50,
    generator=generator,
    guidance_scale=1.0,
-    sag_scale=0.75).images[0]
+    sag_scale=0.75
+).images[0]
+out_sag
 ```

-You can see a pretty nice improvement in the output
-
 <div class="flex justify-center">
  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_sag_1.png"/>
 </div>

-Now we have both `stableDiffusionPipeline` and `StableDiffusionSAGPipeline` co-existing with the same loaded model components;  You can use them interchangeably without additional memory.
+If you check the memory usage, you'll see it remains the same as before because [`StableDiffusionPipeline`] and [`StableDiffusionSAGPipeline`] are sharing the same pipeline components. This allows you to use them interchangeably without any additional memory overhead.

-```
-print(
-    f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB"
-)
+```py
+print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
+"Max memory allocated: 4.406213283538818 GB"
 ```

-```bash
-Max memory allocated: 4.406213283538818 GB
-```
+Let's animate the image with the [`AnimateDiffPipeline`] and also add a [`MotionAdapter`] module to the pipeline. For the [`AnimateDiffPipeline`], you need to unload the IP-Adapter first and reload it *after* you've created your new pipeline (this only applies to the [`AnimateDiffPipeline`]).

-Let's unload the IP adapter from the SAG pipeline. It's important to note that methods like `load_ip_adapter` and `unload_ip_adapter` modify the state of the model components. Therefore, when you use these methods on one pipeline, it will affect all other pipelines that share the same model components.
-
-```bash
-pipe_sag.unload_ip_adapter()
-```
-
-If you try to use the Stable Diffusion pipeline with IP adapter again, it will fail
-
-```bash
-generator = torch.Generator(device="cpu").manual_seed(33)
-out_sd = pipe_sd(
-    prompt=prompt,
-    negative_prompt=negative_prompt, 
-    ip_adapter_image=image,
-    num_inference_steps=num_inference_steps,
-    generator=generator,
-).images[0]
-```
-
-```bash
-AttributeError: 'NoneType' object has no attribute 'image_projection_layers'
-```
-
-Please note that the pipeline methods may not function properly on a new pipeline created using the `from_pipe` method. For instance, the `enable_model_cpu_offload` method installs hooks to the model components based on a unique offloading sequence for each pipeline. Therefore, if the models are executed in a different order in the new pipeline, the CPU offloading may not work correctly.
-
-To ensure proper functionality, we recommend re-applying the pipeline methods on the new pipeline created using the `from_pipe` method.
-
-You can also add or subtract model components when you create new pipelines. Let's now create a AnimateDiff pipeline with an additional `MotionAdapter` module
-
-```bash
+```py
 from diffusers import AnimateDiffPipeline, MotionAdapter, DDIMScheduler
 from diffusers.utils import export_to_gif

+pipe_sag.unload_ip_adapter()
 adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16)

 pipe_animate = AnimateDiffPipeline.from_pipe(pipe_sd, motion_adapter=adapter)
 pipe_animate.scheduler = DDIMScheduler.from_config(pipe_animate.scheduler.config, beta_schedule="linear")
-# load ip_adapter again and load lora weights
+# load IP-Adapter and LoRA weights again
 pipe_animate.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
 pipe_animate.load_lora_weights("guoyww/animatediff-motion-lora-zoom-out", adapter_name="zoom-out")
 pipe_animate.to("cuda")
@@ -318,229 +251,153 @@ pipe_animate.to("cuda")
 generator = torch.Generator(device="cpu").manual_seed(33)
 pipe_animate.set_adapters("zoom-out", adapter_weights=0.75)
 out = pipe_animate(
-    prompt= prompt,
+    prompt="bear eats pizza",
    num_frames=16,
-    num_inference_steps=num_inference_steps,
-    ip_adapter_image = image,
+    num_inference_steps=50,
+    ip_adapter_image=image,
    generator=generator,
 ).frames[0]
 export_to_gif(out, "out_animate.gif")
 ```
+
 <div class="flex justify-center">
  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_animate_3.gif"/>
 </div>

+The [`AnimateDiffPipeline`] is more memory-intensive and consumes 15GB of memory (see the [Memory usage of from_pipe](#memory-usage-of-from_pipe) section to learn what this means for your memory usage).

-When creating multiple pipelines using the `from_pipe` method, it is important to note that the memory requirement will be determined by the pipeline with the highest memory usage. This means that regardless of the number of pipelines you create, the total memory requirement will always be the same as the highest memory requirement among the pipelines.
-
-For example, we have created three pipelines - `stableDiffusionPipeline`, `StableDiffusionSAGPipeline`, and `AnimateDiffPipeline` - and the `AnimateDiffPipeline` has the highest memory requirement, then the total memory usage will be based on the memory requirement of the `AnimateDiffPipeline`. 
-
-Therefore, creating additional pipelines will not add up to the total memory requirement. Each pipeline can be used interchangeably without any additional memory overhead.
-
-
-Did you know that you can use `from_pipe` with a community pipeline? Let me show you an example of using long negative prompt and prompt weighting!
-
-```bash
-pipe_lpw = DiffusionPipeline.from_pipe(
-    pipe_sd,
-    custom_pipeline="lpw_stable_diffusion",
-).to("cuda")
-
-prompt = "best_quality (1girl:1.3) bow bride brown_hair closed_mouth frilled_bow frilled_hair_tubes frills (full_body:1.3) fox_ear hair_bow hair_tubes happy hood japanese_clothes kimono long_sleeves red_bow smile solo tabi uchikake white_kimono wide_sleeves cherry_blossoms"
-neg_prompt = "lowres, bad_anatomy, error_body, error_hair, error_arm, error_hands, bad_hands, error_fingers, bad_fingers, missing_fingers, error_legs, bad_legs, multiple_legs, missing_legs, error_lighting, error_shadow, error_reflection, text, error, extra_digit, fewer_digits, cropped, worst_quality, low_quality, normal_quality, jpeg_artifacts, signature, watermark, username, blurry"
-generator = torch.Generator(device="cpu").manual_seed(33)
-out_lpw = pipe_lpw.text2img(
-    prompt, 
-    negative_prompt=neg_prompt, 
-    width=512,height=512,
-    max_embeddings_multiples=3, 
-    num_inference_steps=num_inference_steps,
-    generator=generator,
-    ).images[0]
+```py
+print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
+"Max memory allocated: 15.178664207458496 GB"
 ```

-<div class="flex justify-center">
-  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_lpw_4.png"/>
-</div>
+### Modify from_pipe components

-let’s run StableDiffusionPipeline with the same inputs to compare:  the result from the long prompt weighting pipeline is more aligned with the text prompt.
+Pipelines loaded with [`~DiffusionPipeline.from_pipe`] can be customized with different model components or methods. However, whenever you modify the *state* of the model components, it affects all the other pipelines that share the same components. For example, if you call [`~diffusers.loaders.IPAdapterMixin.unload_ip_adapter`] on the [`StableDiffusionSAGPipeline`], you won't be able to use IP-Adapter with the [`StableDiffusionPipeline`] because it's been removed from their shared components.
+
+```py
+pipe_sag.unload_ip_adapter()

-```
 generator = torch.Generator(device="cpu").manual_seed(33)
 out_sd = pipe_sd(
-    prompt=prompt,
-    negative_prompt=negative_prompt,
+    prompt="bear eats pizza",
+    negative_prompt="wrong white balance, dark, sketches,worst quality,low quality", 
+    ip_adapter_image=image,
+    num_inference_steps=50,
    generator=generator,
-    num_inference_steps=num_inference_steps,
 ).images[0]
-out_sd
+"AttributeError: 'NoneType' object has no attribute 'image_projection_layers'"
 ```
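This happens because the pipelines hold references to the same component objects rather than copies. The toy sketch below uses hypothetical stand-in classes (`ToyUNet` and `ToyPipeline`, neither of which exists in Diffusers) to illustrate how a state change made through one pipeline is visible through the other.

```python
class ToyUNet:
    """Stand-in for a model component that can hold an adapter."""

    def __init__(self):
        self.ip_adapter = "ip-adapter_sd15"


class ToyPipeline:
    """Stand-in for a pipeline that only stores references to its components."""

    def __init__(self, unet):
        self.unet = unet

    @classmethod
    def from_pipe(cls, other):
        # Reuse the same component object instead of copying it
        return cls(unet=other.unet)

    def unload_ip_adapter(self):
        # Mutates the shared component, not a private copy
        self.unet.ip_adapter = None


pipe_a = ToyPipeline(ToyUNet())
pipe_b = ToyPipeline.from_pipe(pipe_a)

pipe_b.unload_ip_adapter()
print(pipe_a.unet is pipe_b.unet)  # both pipelines point at one object
print(pipe_a.unet.ip_adapter)      # the adapter is gone for pipe_a too
```

The real pipelines are far more involved, but the same aliasing is why unloading the IP-Adapter on one pipeline breaks IP-Adapter inference on the other.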
-<div class="flex justify-center">
-  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_sd_5.png"/>
-</div>

+### Memory usage of from_pipe

-You can easily switch between different pipelines using the `from_pipe` method, similar to turning on and off a feature on your pipeline. To switch between tasks, you can use the `from_pipe` method with `AutoPipeline`, which automatically identifies the pipeline class based on the task. You can find more information about this feature at the [AutoPipe Guide](https://huggingface.co/docs/diffusers/tutorials/autopipeline).
+The memory requirement of loading multiple pipelines with [`~DiffusionPipeline.from_pipe`] is determined by the pipeline with the highest memory usage, regardless of the number of pipelines you create.

+| Pipeline | Memory usage (GB) |
+|---|---|
+| StableDiffusionPipeline | 4.406 |
+| StableDiffusionSAGPipeline | 4.406 |
+| AnimateDiffPipeline | 15.178 |
+
+The [`AnimateDiffPipeline`] has the highest memory requirement, so the *total memory usage* is based only on the [`AnimateDiffPipeline`]. Your memory usage will not increase if you create additional pipelines, as long as their memory requirements don't exceed that of the [`AnimateDiffPipeline`]. Each pipeline can be used interchangeably without any additional memory overhead.
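As a rough illustration of this rule, the sketch below takes the peak measurements printed earlier in this guide (rounded to three decimals) and picks the largest one. The dictionary and variable names are made up for the example.

```python
# Peak memory per pipeline in GB, rounded from the values printed earlier in this guide
peak_memory_gb = {
    "StableDiffusionPipeline": 4.406,
    "StableDiffusionSAGPipeline": 4.406,
    "AnimateDiffPipeline": 15.178,
}

# With shared components, the overall requirement follows the heaviest pipeline,
# not the sum of all pipelines
heaviest = max(peak_memory_gb, key=peak_memory_gb.get)
total_gb = peak_memory_gb[heaviest]
print(f"{heaviest} sets the requirement: {total_gb} GB")
```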
+
+## Safety checker
+
+Diffusers implements a [safety checker](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/safety_checker.py) for Stable Diffusion models which can generate harmful content. The safety checker screens the generated output against known hardcoded not-safe-for-work (NSFW) content. If for whatever reason you'd like to disable the safety checker, pass `safety_checker=None` to the [`~DiffusionPipeline.from_pretrained`] method.
+
+```python
+from diffusers import DiffusionPipeline
+
+pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", safety_checker=None, use_safetensors=True)
+"""
+You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide by the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend keeping the safety filter enabled in all public-facing circumstances, disabling it only for use cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
+"""
+```

 ## Checkpoint variants

 A checkpoint variant is usually a checkpoint whose weights are:

- Stored in a different floating point type for lower precision and lower storage, such as [`torch.float16`](https://pytorch.org/docs/stable/tensors.html#data-types), because it only requires half the bandwidth and storage to download. You can't use this variant if you're continuing training or using a CPU.
- Non-exponential mean averaged (EMA) weights, which shouldn't be used for inference. You should use these to continue fine-tuning a model.
+- Stored in a different floating point type, such as [torch.float16](https://pytorch.org/docs/stable/tensors.html#data-types), because it only requires half the bandwidth and storage to download. You can't use this variant if you're continuing training or using a CPU.
+- Non-exponential moving average (non-EMA) weights, which shouldn't be used for inference. You should use this variant to continue finetuning a model.

-<Tip>
+> [!TIP]
+> When the checkpoints have identical model structures, but they were trained on different datasets and with a different training setup, they should be stored in separate repositories. For example, [stabilityai/stable-diffusion-2](https://hf.co/stabilityai/stable-diffusion-2) and [stabilityai/stable-diffusion-2-1](https://hf.co/stabilityai/stable-diffusion-2-1) are stored in separate repositories.

-💡 When the checkpoints have identical model structures, but they were trained on different datasets and with a different training setup, they should be stored in separate repositories instead of variations (for example, [`stable-diffusion-v1-4`] and [`stable-diffusion-v1-5`]).
+Otherwise, a variant is **identical** to the original checkpoint. They have exactly the same serialization format (like [safetensors](./using_safetensors)), model structure, and their weights have identical tensor shapes.

-</Tip>
+| **checkpoint type** | **weight name**                             | **argument for loading weights** |
+|---------------------|---------------------------------------------|----------------------------------|
+| original            | diffusion_pytorch_model.safetensors         |                                  |
+| floating point      | diffusion_pytorch_model.fp16.safetensors    | `variant`, `torch_dtype`         |
+| non-EMA             | diffusion_pytorch_model.non_ema.safetensors | `variant`                        |

-Otherwise, a variant is **identical** to the original checkpoint. They have exactly the same serialization format (like [Safetensors](./using_safetensors)), model structure, and weights that have identical tensor shapes.
+There are two important arguments for loading variants:

-| **checkpoint type** | **weight name**                     | **argument for loading weights** |
-|---------------------|-------------------------------------|----------------------------------|
-| original            | diffusion_pytorch_model.bin         |                                  |
-| floating point      | diffusion_pytorch_model.fp16.bin    | `variant`, `torch_dtype`         |
-| non-EMA             | diffusion_pytorch_model.non_ema.bin | `variant`                        |
+- `torch_dtype` specifies the floating point precision of the loaded checkpoint. For example, if you want to save bandwidth by loading an fp16 variant, you should set `variant="fp16"` and `torch_dtype=torch.float16` to *convert the weights* to fp16. Otherwise, the fp16 weights are converted to the default fp32 precision.

-There are two important arguments to know for loading variants:
+  If you only set `torch_dtype=torch.float16`, the default fp32 weights are downloaded first and then converted to fp16.

- `torch_dtype` defines the floating point precision of the loaded checkpoints. For example, if you want to save bandwidth by loading a `fp16` variant, you should specify `torch_dtype=torch.float16` to *convert the weights* to `fp16`. Otherwise, the `fp16` weights are converted to the default `fp32` precision. You can also load the original checkpoint without defining the `variant` argument, and convert it to `fp16` with `torch_dtype=torch.float16`. In this case, the default `fp32` weights are downloaded first, and then they're converted to `fp16` after loading.
+- `variant` specifies which files should be loaded from the repository. For example, if you want to load a non-EMA variant of a UNet from [runwayml/stable-diffusion-v1-5](https://hf.co/runwayml/stable-diffusion-v1-5/tree/main/unet), set `variant="non_ema"` to download the `non_ema` file.
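The `variant` value corresponds to the infix in the weight file names listed in the table above. The sketch below mirrors that naming pattern with a hypothetical helper, `variant_weight_name`. It is an illustration of the convention only, not the function Diffusers uses internally.

```python
from typing import Optional


def variant_weight_name(
    variant: Optional[str] = None,
    base: str = "diffusion_pytorch_model",
    ext: str = "safetensors",
) -> str:
    """Build a weight file name following the pattern shown in the table above."""
    if variant is None:
        # Original checkpoint: no variant infix
        return f"{base}.{ext}"
    return f"{base}.{variant}.{ext}"


for name in (None, "fp16", "non_ema"):
    print(variant_weight_name(name))
```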

- `variant` defines which files should be loaded from the repository. For example, if you want to load a `non_ema` variant from the [`diffusers/stable-diffusion-variants`](https://huggingface.co/diffusers/stable-diffusion-variants/tree/main/unet) repository, you should specify `variant="non_ema"` to download the `non_ema` files.
+<hfoptions id="variants">
+<hfoption id="fp16">

-```python
+```py
 from diffusers import DiffusionPipeline
 import torch

-# load fp16 variant
-stable_diffusion = DiffusionPipeline.from_pretrained(
+pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16, use_safetensors=True
 )
-# load non_ema variant
-stable_diffusion = DiffusionPipeline.from_pretrained(
+```
+
+</hfoption>
+<hfoption id="non-EMA">
+
+```py
+pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", variant="non_ema", use_safetensors=True
 )
 ```

-To save a checkpoint stored in a different floating-point type or as a non-EMA variant, use the [`DiffusionPipeline.save_pretrained`] method and specify the `variant` argument. You should try and save a variant to the same folder as the original checkpoint, so you can load both from the same folder:
+</hfoption>
+</hfoptions>
+
+Use the `variant` parameter in the [`DiffusionPipeline.save_pretrained`] method to save a checkpoint as a different floating point type or as a non-EMA variant. You should try to save a variant to the same folder as the original checkpoint, so you have the option of loading both from the same folder.
+
+<hfoptions id="save">
+<hfoption id="fp16">

 ```python
 from diffusers import DiffusionPipeline

-# save as fp16 variant
-stable_diffusion.save_pretrained("runwayml/stable-diffusion-v1-5", variant="fp16")
-# save as non-ema variant
-stable_diffusion.save_pretrained("runwayml/stable-diffusion-v1-5", variant="non_ema")
+pipeline.save_pretrained("runwayml/stable-diffusion-v1-5", variant="fp16")
 ```

-If you don't save the variant to an existing folder, you must specify the `variant` argument otherwise it'll throw an `Exception` because it can't find the original checkpoint:
+</hfoption>
+<hfoption id="non_ema">
+
+```py
+pipeline.save_pretrained("runwayml/stable-diffusion-v1-5", variant="non_ema")
+```
+
+</hfoption>
+</hfoptions>
+
+If you don't save the variant to an existing folder, you must specify the `variant` argument; otherwise, it'll throw an `Exception` because it can't find the original checkpoint.

 ```python
 # 👎 this won't work
-stable_diffusion = DiffusionPipeline.from_pretrained(
+pipeline = DiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
 )
 # 👍 this works
-stable_diffusion = DiffusionPipeline.from_pretrained(
+pipeline = DiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16, use_safetensors=True
 )
 ```

-<!--
-TODO(Patrick) - Make sure to uncomment this part as soon as things are deprecated.
-
-#### Using `revision` to load pipeline variants is deprecated
-
-Previously the `revision` argument of [`DiffusionPipeline.from_pretrained`] was heavily used to
-load model variants, e.g.:
-
-```python
-from diffusers import DiffusionPipeline
-
-pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", revision="fp16", use_safetensors=True)
-```
-
-However, this behavior is now deprecated since the "revision" argument should (just as it's done in GitHub) better be used to load model checkpoints from a specific commit or branch in development.
-
-The above example is therefore deprecated and won't be supported anymore for `diffusers >= 1.0.0`.
-
-<Tip warning={true}>
-
-If you load diffusers pipelines or models with `revision="fp16"` or `revision="non_ema"`,
-please make sure to update the code and use `variant="fp16"` or `variation="non_ema"` respectively
-instead.
-
-</Tip>
-->
-
-## Models
-
-Models are loaded from the [`ModelMixin.from_pretrained`] method, which downloads and caches the latest version of the model weights and configurations. If the latest files are available in the local cache, [`~ModelMixin.from_pretrained`] reuses files in the cache instead of re-downloading them.
-
-Models can be loaded from a subfolder with the `subfolder` argument. For example, the model weights for `runwayml/stable-diffusion-v1-5` are stored in the [`unet`](https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main/unet) subfolder:
-
-```python
-from diffusers import UNet2DConditionModel
-
-repo_id = "runwayml/stable-diffusion-v1-5"
-model = UNet2DConditionModel.from_pretrained(repo_id, subfolder="unet", use_safetensors=True)
-```
-
-Or directly from a repository's [directory](https://huggingface.co/google/ddpm-cifar10-32/tree/main):
-
-```python
-from diffusers import UNet2DModel
-
-repo_id = "google/ddpm-cifar10-32"
-model = UNet2DModel.from_pretrained(repo_id, use_safetensors=True)
-```
-
-You can also load and save model variants by specifying the `variant` argument in [`ModelMixin.from_pretrained`] and [`ModelMixin.save_pretrained`]:
-
-```python
-from diffusers import UNet2DConditionModel
-
-model = UNet2DConditionModel.from_pretrained(
-    "runwayml/stable-diffusion-v1-5", subfolder="unet", variant="non_ema", use_safetensors=True
-)
-model.save_pretrained("./local-unet", variant="non_ema")
-```
-
-## Schedulers
-
-Schedulers are loaded from the [`SchedulerMixin.from_pretrained`] method, and unlike models, schedulers are **not parameterized** or **trained**; they are defined by a configuration file.
-
-Loading schedulers does not consume any significant amount of memory and the same configuration file can be used for a variety of different schedulers.
-For example, the following schedulers are compatible with [`StableDiffusionPipeline`], which means you can load the same scheduler configuration file in any of these classes:
-
-```python
-from diffusers import StableDiffusionPipeline
-from diffusers import (
-    DDPMScheduler,
-    DDIMScheduler,
-    PNDMScheduler,
-    LMSDiscreteScheduler,
-    EulerAncestralDiscreteScheduler,
-    EulerDiscreteScheduler,
-    DPMSolverMultistepScheduler,
-)
-
-repo_id = "runwayml/stable-diffusion-v1-5"
-
-ddpm = DDPMScheduler.from_pretrained(repo_id, subfolder="scheduler")
-ddim = DDIMScheduler.from_pretrained(repo_id, subfolder="scheduler")
-pndm = PNDMScheduler.from_pretrained(repo_id, subfolder="scheduler")
-lms = LMSDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
-euler_anc = EulerAncestralDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
-euler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
-dpm = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")
-
-# replace `dpm` with any of `ddpm`, `ddim`, `pndm`, `lms`, `euler_anc`, `euler`
-pipeline = StableDiffusionPipeline.from_pretrained(repo_id, scheduler=dpm, use_safetensors=True)
-```
-
 ## DiffusionPipeline explained

 As a class method, [`DiffusionPipeline.from_pretrained`] is responsible for two things:
--- a/docs/source/en/using-diffusers/loading_overview.md
+++ b/docs/source/en/using-diffusers/loading_overview.md
@@ -1,17 +0,0 @@
-<!--Copyright 2024 The HuggingFace Team. All rights reserved.
-
-Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
-the License. You may obtain a copy of the License at
-
-http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
-an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
-specific language governing permissions and limitations under the License.
-->
-
-# Overview
-
-🧨 Diffusers offers many pipelines, models, and schedulers for generative tasks. To make loading these components as simple as possible, we provide a single and unified method - `from_pretrained()` - that loads any of these components from either the Hugging Face [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) or your local machine. Whenever you load a pipeline or model, the latest files are automatically downloaded and cached so you can quickly reuse them next time without redownloading the files.
-
-This section will show you everything you need to know about loading pipelines, how to load different components in a pipeline, how to load checkpoint variants, and how to load community pipelines. You'll also learn how to load schedulers and compare the speed and quality trade-offs of using different schedulers. Finally, you'll see how to convert and load KerasCV checkpoints so you can use them in PyTorch with 🧨 Diffusers.
--- a/docs/source/en/using-diffusers/schedulers.md
+++ b/docs/source/en/using-diffusers/schedulers.md
@@ -10,57 +10,27 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->

-# Schedulers
+# Load schedulers and models

 [[open-in-colab]]

-Diffusion pipelines are inherently a collection of diffusion models and schedulers that are partly independent from each other. This means that one is able to switch out parts of the pipeline to better customize
-a pipeline to one's use case. The best example of this is the [Schedulers](../api/schedulers/overview).
+Diffusion pipelines are a collection of interchangeable schedulers and models that can be mixed and matched to tailor a pipeline to a specific use case. The scheduler encapsulates the entire denoising process, such as the number of denoising steps and the algorithm for finding the denoised sample. Schedulers are not parameterized or trained, so they don't take much memory. The model is usually only concerned with the forward pass of going from a noisy input to a less noisy sample.

-Whereas diffusion models usually simply define the forward pass from noise to a less noisy sample,
-schedulers define the whole denoising process, *i.e.*:
- How many denoising steps?
- Stochastic or deterministic?
- What algorithm to use to find the denoised sample?
+This guide will show you how to load schedulers and models to customize a pipeline. You'll use the [runwayml/stable-diffusion-v1-5](https://hf.co/runwayml/stable-diffusion-v1-5) checkpoint throughout this guide, so let's load it first.

-They can be quite complex and often define a trade-off between **denoising speed** and **denoising quality**.
-It is extremely difficult to measure quantitatively which scheduler works best for a given diffusion pipeline, so it is often recommended to simply try out which works best.
-
-The following paragraphs show how to do so with the 🧨 Diffusers library.
-
-## Load pipeline
-
-Let's start by loading the [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) model in the [`DiffusionPipeline`]:
-
-```python
-from huggingface_hub import login
-from diffusers import DiffusionPipeline
+```py
 import torch
-
-login()
+from diffusers import DiffusionPipeline

 pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
-)
+).to("cuda")
 ```

-Next, we move it to GPU:
+You can see what scheduler this pipeline uses with the `pipeline.scheduler` attribute.

-```python
-pipeline.to("cuda")
-```
-
-## Access the scheduler
-
-The scheduler is always one of the components of the pipeline and is usually called `"scheduler"`.
-So it can be accessed via the `"scheduler"` property.
-
-```python
+```py
 pipeline.scheduler
-```
-
-**Output**:
-```
 PNDMScheduler {
  "_class_name": "PNDMScheduler",
  "_diffusers_version": "0.21.4",
@@ -77,235 +47,156 @@ PNDMScheduler {
 }
 ```

-We can see that the scheduler is of type [`PNDMScheduler`].
-Cool, now let's compare the scheduler in its performance to other schedulers.
-First we define a prompt on which we will test all the different schedulers:
+## Load a scheduler

-```python
-prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."
-```
+Schedulers are defined by a configuration file that can be used by a variety of schedulers. Load a scheduler with the [`SchedulerMixin.from_pretrained`] method, and specify the `subfolder` parameter to load the configuration file from the correct subfolder of the pipeline repository.

-Next, we create a generator from a random seed that will ensure that we can generate similar images as well as run the pipeline:
+For example, to load the [`DDIMScheduler`]:

-```python
-generator = torch.Generator(device="cuda").manual_seed(8)
-image = pipeline(prompt, generator=generator).images[0]
-image
-```
-
-<p align="center">
-    <br>
-    <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_pndm.png" width="400"/>
-    <br>
-</p>
-
-
-## Changing the scheduler
-
-Now we show how easy it is to change the scheduler of a pipeline. Every scheduler has a property [`~SchedulerMixin.compatibles`]
-which defines all compatible schedulers. You can take a look at all available, compatible schedulers for the Stable Diffusion pipeline as follows.
-
-```python
-pipeline.scheduler.compatibles
-```
-
-**Output**:
-```
-[diffusers.utils.dummy_torch_and_torchsde_objects.DPMSolverSDEScheduler,
- diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler,
- diffusers.schedulers.scheduling_lms_discrete.LMSDiscreteScheduler,
- diffusers.schedulers.scheduling_ddim.DDIMScheduler,
- diffusers.schedulers.scheduling_ddpm.DDPMScheduler,
- diffusers.schedulers.scheduling_heun_discrete.HeunDiscreteScheduler,
- diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler,
- diffusers.schedulers.scheduling_deis_multistep.DEISMultistepScheduler,
- diffusers.schedulers.scheduling_pndm.PNDMScheduler,
- diffusers.schedulers.scheduling_euler_ancestral_discrete.EulerAncestralDiscreteScheduler,
- diffusers.schedulers.scheduling_unipc_multistep.UniPCMultistepScheduler,
- diffusers.schedulers.scheduling_k_dpm_2_discrete.KDPM2DiscreteScheduler,
- diffusers.schedulers.scheduling_dpmsolver_singlestep.DPMSolverSinglestepScheduler,
- diffusers.schedulers.scheduling_k_dpm_2_ancestral_discrete.KDPM2AncestralDiscreteScheduler]
-```
-
-Cool, lots of schedulers to look at. Feel free to have a look at their respective class definitions:
-
- [`EulerDiscreteScheduler`],
- [`LMSDiscreteScheduler`],
- [`DDIMScheduler`],
- [`DDPMScheduler`],
- [`HeunDiscreteScheduler`],
- [`DPMSolverMultistepScheduler`],
- [`DEISMultistepScheduler`],
- [`PNDMScheduler`],
- [`EulerAncestralDiscreteScheduler`],
- [`UniPCMultistepScheduler`],
- [`KDPM2DiscreteScheduler`],
- [`DPMSolverSinglestepScheduler`],
- [`KDPM2AncestralDiscreteScheduler`].
-
-We will now compare the input prompt with all other schedulers. To change the scheduler of the pipeline you can make use of the
-convenient [`~ConfigMixin.config`] property in combination with the [`~ConfigMixin.from_config`] function.
-
-```python
-pipeline.scheduler.config
-```
-
-returns a dictionary of the configuration of the scheduler:
-
-**Output**:
 ```py
-FrozenDict([('num_train_timesteps', 1000),
-            ('beta_start', 0.00085),
-            ('beta_end', 0.012),
-            ('beta_schedule', 'scaled_linear'),
-            ('trained_betas', None),
-            ('skip_prk_steps', True),
-            ('set_alpha_to_one', False),
-            ('prediction_type', 'epsilon'),
-            ('timestep_spacing', 'leading'),
-            ('steps_offset', 1),
-            ('_use_default_values', ['timestep_spacing', 'prediction_type']),
-            ('_class_name', 'PNDMScheduler'),
-            ('_diffusers_version', '0.21.4'),
-            ('clip_sample', False)])
+from diffusers import DDIMScheduler, DiffusionPipeline
+
+ddim = DDIMScheduler.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="scheduler")
 ```

-This configuration can then be used to instantiate a scheduler
-of a different class that is compatible with the pipeline. Here,
-we change the scheduler to the [`DDIMScheduler`].
+Then you can pass the newly loaded scheduler to the pipeline.

 ```python
-from diffusers import DDIMScheduler
-
-pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)
+pipeline = DiffusionPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", scheduler=ddim, torch_dtype=torch.float16, use_safetensors=True
+).to("cuda")
 ```

-Cool, now we can run the pipeline again to compare the generation quality.
-
-```python
-generator = torch.Generator(device="cuda").manual_seed(8)
-image = pipeline(prompt, generator=generator).images[0]
-image
-```
-
-<p align="center">
-    <br>
-    <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_ddim.png" width="400"/>
-    <br>
-</p>
-
-If you are a JAX/Flax user, please check [this section](#changing-the-scheduler-in-flax) instead.
-
 ## Compare schedulers

-So far we have tried running the stable diffusion pipeline with two schedulers: [`PNDMScheduler`] and [`DDIMScheduler`].
-A number of better schedulers have been released that can be run with much fewer steps; let's compare them here:
+Schedulers have their own unique strengths and weaknesses, making it difficult to quantitatively compare which scheduler works best for a pipeline. You typically have to make a trade-off between denoising speed and denoising quality. We recommend trying out different schedulers to find one that works best for your use case. Check the `pipeline.scheduler.compatibles` attribute to see which schedulers are compatible with a pipeline.

-[`LMSDiscreteScheduler`] usually leads to better results:
+Let's compare the [`LMSDiscreteScheduler`], [`EulerDiscreteScheduler`], [`EulerAncestralDiscreteScheduler`], and the [`DPMSolverMultistepScheduler`] on the following prompt and seed.

-```python
+```py
+import torch
+from diffusers import DiffusionPipeline
+
+pipeline = DiffusionPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
+).to("cuda")
+
+prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."
+generator = torch.Generator(device="cuda").manual_seed(8)
+```
+
+To change the pipeline's scheduler, use the [`~ConfigMixin.from_config`] method to load the current scheduler's configuration (`pipeline.scheduler.config`) into a different scheduler class.
+
+<hfoptions id="schedulers">
+<hfoption id="LMSDiscreteScheduler">
+
+[`LMSDiscreteScheduler`] typically generates higher quality images than the default scheduler.
+
+```py
 from diffusers import LMSDiscreteScheduler

 pipeline.scheduler = LMSDiscreteScheduler.from_config(pipeline.scheduler.config)
-
-generator = torch.Generator(device="cuda").manual_seed(8)
 image = pipeline(prompt, generator=generator).images[0]
 image
 ```

-<p align="center">
-    <br>
-    <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_lms.png" width="400"/>
-    <br>
-</p>
+</hfoption>
+<hfoption id="EulerDiscreteScheduler">

+[`EulerDiscreteScheduler`] can generate high-quality images in as few as 30 steps; pass `num_inference_steps=30` to the pipeline call to try it.

-[`EulerDiscreteScheduler`] and [`EulerAncestralDiscreteScheduler`] can generate high quality results with as little as 30 steps.
-
-```python
+```py
 from diffusers import EulerDiscreteScheduler

 pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)
-
-generator = torch.Generator(device="cuda").manual_seed(8)
-image = pipeline(prompt, generator=generator, num_inference_steps=30).images[0]
+image = pipeline(prompt, generator=generator).images[0]
 image
 ```

-<p align="center">
-    <br>
-    <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_euler_discrete.png" width="400"/>
-    <br>
-</p>
+</hfoption>
+<hfoption id="EulerAncestralDiscreteScheduler">

+[`EulerAncestralDiscreteScheduler`] can generate high-quality images in as few as 30 steps; pass `num_inference_steps=30` to the pipeline call to try it.

-and:
-
-```python
+```py
 from diffusers import EulerAncestralDiscreteScheduler

 pipeline.scheduler = EulerAncestralDiscreteScheduler.from_config(pipeline.scheduler.config)
-
-generator = torch.Generator(device="cuda").manual_seed(8)
-image = pipeline(prompt, generator=generator, num_inference_steps=30).images[0]
+image = pipeline(prompt, generator=generator).images[0]
 image
 ```

-<p align="center">
-    <br>
-    <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_euler_ancestral.png" width="400"/>
-    <br>
-</p>
+</hfoption>
+<hfoption id="DPMSolverMultistepScheduler">

+[`DPMSolverMultistepScheduler`] provides a balance between speed and quality and can generate high-quality images in as few as 20 steps; pass `num_inference_steps=20` to the pipeline call to try it.

-[`DPMSolverMultistepScheduler`] gives a reasonable speed/quality trade-off and can be run with as little as 20 steps.
-
-```python
+```py
 from diffusers import DPMSolverMultistepScheduler

 pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
-
-generator = torch.Generator(device="cuda").manual_seed(8)
-image = pipeline(prompt, generator=generator, num_inference_steps=20).images[0]
+image = pipeline(prompt, generator=generator).images[0]
 image
 ```

-<p align="center">
-    <br>
-    <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_dpm.png" width="400"/>
-    <br>
-</p>
+</hfoption>
+</hfoptions>

-As you can see, most images look very similar and are arguably of very similar quality. It often really depends on the specific use case which scheduler to choose. A good approach is always to run multiple different
-schedulers to compare results.
+<div class="flex gap-4">
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_lms.png" />
+    <figcaption class="mt-2 text-center text-sm text-gray-500">LMSDiscreteScheduler</figcaption>
+  </div>
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_euler_discrete.png" />
+    <figcaption class="mt-2 text-center text-sm text-gray-500">EulerDiscreteScheduler</figcaption>
+  </div>
+</div>
+<div class="flex gap-4">
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_euler_ancestral.png" />
+    <figcaption class="mt-2 text-center text-sm text-gray-500">EulerAncestralDiscreteScheduler</figcaption>
+  </div>
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_dpm.png" />
+    <figcaption class="mt-2 text-center text-sm text-gray-500">DPMSolverMultistepScheduler</figcaption>
+  </div>
+</div>

-## Changing the Scheduler in Flax
+Most images look very similar and are comparable in quality. Again, the best choice often comes down to your specific use case, so a good approach is to run several different schedulers and compare the results.

-If you are a JAX/Flax user, you can also change the default pipeline scheduler. This is a complete example of how to run inference using the Flax Stable Diffusion pipeline and the super-fast [DPM-Solver++ scheduler](../api/schedulers/multistep_dpm_solver):
+### Flax schedulers

-```Python
+To compare Flax schedulers, you need to additionally load the scheduler state into the model parameters. For example, let's change the default scheduler in [`FlaxStableDiffusionPipeline`] to use the super fast [`FlaxDPMSolverMultistepScheduler`].
+
+> [!WARNING]
+> The [`FlaxLMSDiscreteScheduler`] and [`FlaxDDPMScheduler`] are not compatible with the [`FlaxStableDiffusionPipeline`] yet.
+
+```py
 import jax
 import numpy as np
 from flax.jax_utils import replicate
 from flax.training.common_utils import shard
-
 from diffusers import FlaxStableDiffusionPipeline, FlaxDPMSolverMultistepScheduler

-model_id = "runwayml/stable-diffusion-v1-5"
 scheduler, scheduler_state = FlaxDPMSolverMultistepScheduler.from_pretrained(
-    model_id,
+    "runwayml/stable-diffusion-v1-5",
    subfolder="scheduler"
 )
 pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
-    model_id,
+    "runwayml/stable-diffusion-v1-5",
    scheduler=scheduler,
    revision="bf16",
    dtype=jax.numpy.bfloat16,
 )
 params["scheduler"] = scheduler_state
+```

+Then you can take advantage of Flax's compatibility with TPUs to generate a number of images in parallel. You'll need to make a copy of the model parameters for each available device and then split the inputs across them to generate your desired number of images.
+
+```py
 # Generate 1 image per parallel device (8 on TPUv2-8 or TPUv3-8)
-prompt = "a photo of an astronaut riding a horse on mars"
+prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."
 num_samples = jax.device_count()
 prompt_ids = pipeline.prepare_inputs([prompt] * num_samples)

@@ -321,11 +212,33 @@ images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).
 images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
 ```

-<Tip warning={true}>
+## Models

-The following Flax schedulers are _not yet compatible_ with the Flax Stable Diffusion Pipeline:
+Models are loaded with the [`ModelMixin.from_pretrained`] method, which downloads and caches the latest version of the model weights and configurations. If the latest files are available in the local cache, [`~ModelMixin.from_pretrained`] reuses files in the cache instead of re-downloading them.

- `FlaxLMSDiscreteScheduler`
- `FlaxDDPMScheduler`
+Models can be loaded from a subfolder with the `subfolder` argument. For example, the UNet weights for [runwayml/stable-diffusion-v1-5](https://hf.co/runwayml/stable-diffusion-v1-5) are stored in the [unet](https://hf.co/runwayml/stable-diffusion-v1-5/tree/main/unet) subfolder.

-</Tip>
+```python
+from diffusers import UNet2DConditionModel
+
+unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet", use_safetensors=True)
+```
+
+They can also be directly loaded from a [repository](https://huggingface.co/google/ddpm-cifar10-32/tree/main).
+
+```python
+from diffusers import UNet2DModel
+
+unet = UNet2DModel.from_pretrained("google/ddpm-cifar10-32", use_safetensors=True)
+```
+
+To load and save model variants, specify the `variant` argument in [`ModelMixin.from_pretrained`] and [`ModelMixin.save_pretrained`].
+
+```python
+from diffusers import UNet2DConditionModel
+
+unet = UNet2DConditionModel.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", subfolder="unet", variant="non_ema", use_safetensors=True
+)
+unet.save_pretrained("./local-unet", variant="non_ema")
+```
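
For orientation, variant weights are stored next to the default weights, with the variant name inserted before the file extension (for example, `diffusion_pytorch_model.non_ema.safetensors` in the `unet` subfolder). The sketch below illustrates that naming convention in plain Python. `variant_filename` is a hypothetical helper written for this example, not a Diffusers API, and the library's actual file resolution likely handles more cases than this.

```python
from pathlib import Path
from typing import Optional


def variant_filename(weights_name: str, variant: Optional[str] = None) -> str:
    """Hypothetical helper: insert the variant tag before the file extension."""
    if variant is None:
        # No variant requested: the default weights file name is used as-is.
        return weights_name
    path = Path(weights_name)
    return f"{path.stem}.{variant}{path.suffix}"


# Default weights versus the `non_ema` variant used in the example above.
print(variant_filename("diffusion_pytorch_model.safetensors"))
print(variant_filename("diffusion_pytorch_model.safetensors", "non_ema"))
```

Under this convention, saving with `variant="non_ema"` should produce a file whose name carries the `non_ema` tag, which is what lets several variants coexist in one folder.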