1
0
mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
Commit Graph

1045 Commits

Author SHA1 Message Date
Edna
8adc6003ba Chroma Pipeline (#11698)
* working state from hameerabbasi and iddl

* working state form hameerabbasi and iddl (transformer)

* working state (normalization)

* working state (embeddings)

* add chroma loader

* add chroma to mappings

* add chroma to transformer init

* take out variant stuff

* get decently far in changing variant stuff

* add chroma init

* make chroma output class

* add chroma transformer to dummy tp

* add chroma to init

* add chroma to init

* fix single file

* update

* update

* add chroma to auto pipeline

* add chroma to pipeline init

* change to chroma transformer

* take out variant from blocks

* swap embedder location

* remove prompt_2

* work on swapping text encoders

* remove mask function

* dont modify mask (for now)

* wrap attn mask

* no attn mask (can't get it to work)

* remove pooled prompt embeds

* change to my own unpooled embeddeer

* fix load

* take pooled projections out of transformer

* ensure correct dtype for chroma embeddings

* update

* use dn6 attn mask + fix true_cfg_scale

* use chroma pipeline output

* use DN6 embeddings

* remove guidance

* remove guidance embed (pipeline)

* remove guidance from embeddings

* don't return length

* dont change dtype

* remove unused stuff, fix up docs

* add chroma autodoc

* add .md (oops)

* initial chroma docs

* undo don't change dtype

* undo arxiv change

unsure why that happened

* fix hf papers regression in more places

* Update docs/source/en/api/pipelines/chroma.md

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* do_cfg -> self.do_classifier_free_guidance

* Update docs/source/en/api/models/chroma_transformer.md

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update chroma.md

* Move chroma layers into transformer

* Remove pruned AdaLayerNorms

* Add chroma fast tests

* (untested) batch cond and uncond

* Add # Copied from for shift

* Update # Copied from statements

* update norm imports

* Revert cond + uncond batching

* Add transformer tests

* move chroma test (oops)

* chroma init

* fix chroma pipeline fast tests

* Update src/diffusers/models/transformers/transformer_chroma.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Move Approximator and Embeddings

* Fix auto pipeline + make style, quality

* make style

* Apply style fixes

* switch to new input ids

* fix # Copied from error

* remove # Copied from on protected members

* try to fix import

* fix import

* make fix-copes

* revert style fix

* update chroma transformer params

* update chroma transformer approximator init params

* update to pad tokens

* fix batch inference

* Make more pipeline tests work

* Make most transformer tests work

* fix docs

* make style, make quality

* skip batch tests

* fix test skipping

* fix test skipping again

* fix for tests

* Fix all pipeline test

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* Revert "fix equal size list input"

This reverts commit 3fe4ad67d5.

* update

* update

* update

* update

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-14 06:52:56 +05:30
Aryan
9f91305f85 Cosmos Predict2 (#11695)
* support text-to-image

* update example

* make fix-copies

* support use_flow_sigmas in EDM scheduler instead of maintain cosmos-specific scheduler

* support video-to-world

* update

* rename text2image pipeline

* make fix-copies

* add t2i test

* add test for v2w pipeline

* support edm dpmsolver multistep

* update

* update

* update

* update tests

* fix tests

* safety checker

* make conversion script work without guardrail
2025-06-14 01:51:29 +05:30
Sayak Paul
62cbde8d41 [docs] mention fp8 benefits on supported hardware. (#11699)
* mention fp8 benefits on supported hardware.

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-13 07:17:03 +05:30
Sayak Paul
00b179fb1a [docs] add compilation bits to the bitsandbytes docs. (#11693)
* add compilation bits to the bitsandbytes docs.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* finish

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-12 08:49:24 +05:30
Aryan
73a9d5856f Wan VACE (#11582)
* initial support

* make fix-copies

* fix no split modules

* add conversion script

* refactor

* add pipeline test

* refactor

* fix bug with mask

* fix for reference images

* remove print

* update docs

* update slices

* update

* update

* update example
2025-06-06 17:53:10 +05:30
Steven Liu
c934720629 [docs] Model cards (#11112)
* initial

* update

* hunyuanvideo

* ltx

* fix

* wan

* gen guide

* feedback

* feedback

* pipeline-level quant config

* feedback

* ltx
2025-06-02 16:55:14 -07:00
Steven Liu
9f48394bf7 [docs] Caching methods (#11625)
* cache

* feedback
2025-06-02 10:58:47 -07:00
Sayak Paul
b975bceff3 [docs] update torchao doc link (#11634)
update torchao doc link
2025-05-30 08:30:36 -07:00
VLT Media
d0ec6601df Bug: Fixed Image 2 Image example (#11619)
Bug: Fixed Image 2 Image example where a PIL.Image was improperly being asked for an item via index.
2025-05-30 11:30:52 +05:30
Steven Liu
be2fb77dc1 [docs] PyTorch 2.0 (#11618)
* combine

* Update docs/source/en/optimization/fp16.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-28 09:42:41 -07:00
Linoy Tsaban
28ef0165b9 [Sana Sprint] add image-to-image pipeline (#11602)
* sana sprint img2img

* fix import

* fix name

* fix image encoding

* fix image encoding

* fix image encoding

* fix image encoding

* fix image encoding

* fix image encoding

* try w/o strength

* try scaling differently

* try with strength

* revert unnecessary changes to scheduler

* revert unnecessary changes to scheduler

* Apply style fixes

* remove comment

* add copy statements

* add copy statements

* add to doc

* add to doc

* add to doc

* add to doc

* Apply style fixes

* empty commit

* fix copies

* fix copies

* fix copies

* fix copies

* fix copies

* docs

* make fix-copies.

* fix doc building error.

* initial commit - add img2img test

* initial commit - add img2img test

* fix import

* fix imports

* Apply style fixes

* empty commit

* remove

* empty commit

* test vocab size

* fix

* fix prompt missing from last commits

* small changes

* fix image processing when input is tensor

* fix order

* Apply style fixes

* empty commit

* fix shape

* remove comment

* image processing

* remove comment

* skip vae tiling test for now

* Apply style fixes

* empty commit

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2025-05-27 22:09:51 +03:00
Steven Liu
7ae546f8d1 [docs] Pipeline-level quantization (#11604)
refactor
2025-05-26 14:12:57 +05:30
regisss
f161e277d0 Update Intel Gaudi doc (#11479)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-23 10:41:49 +02:00
Steven Liu
23a4ff8488 [docs] Remove fast diffusion tutorial (#11583)
remove tutorial

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-20 08:56:12 -07:00
osrm
8705af0914 docs: fix invalid links (#11505)
* fix invalid link lora.md

* fix invalid link controlnet_sdxl.md

The Hugging Face models page now uses the tags parameter instead of the other parameter for tag-based filtering. Therefore, to simultaneously apply both the "Stable Diffusion XL" and "ControlNet" tags, the following URL should be used: https://huggingface.co/models?tags=stable-diffusion-xl,controlnet

* fix invalid link cosine_dpm.md

"https://github.com/Stability-AI/stable-audio-tool" -> "https://github.com/Stability-AI/stable-audio-tools"

* Update controlnet_sdxl.md

* Update cosine_dpm.md

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-20 08:55:41 -07:00
Aryan
05c8b42b75 LTX 0.9.7-distilled; documentation improvements (#11571)
* add guidance rescale

* update docs

* support adaptive instance norm filter

* fix custom timesteps support

* add custom timestep example to docs

* add a note about best generation settings being available only in the original repository

* use original org hub ids instead of personal

* make fix-copies

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-05-20 02:29:16 +05:30
Quentin Gallouédec
c8bb1ff53e Use HF Papers (#11567)
* Use HF Papers

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-19 06:22:33 -10:00
Sayak Paul
6918f6d19a [docs] tip for group offloding + quantization (#11576)
* tip for group offloding + quantization

Co-authored-by: Aryan VS <contact.aryanvs@gmail.com>

* Apply suggestions from code review

Co-authored-by: Aryan <aryan@huggingface.co>

---------

Co-authored-by: Aryan VS <contact.aryanvs@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-05-19 14:49:15 +05:30
space_samurai
8270fa58e4 Doc update (#11531)
Update docs/source/en/using-diffusers/inpaint.md
2025-05-19 13:32:08 +05:30
Sayak Paul
9836f0e000 [docs] Regional compilation docs (#11556)
* add regional compilation docs.

* minor.

* reviwer feedback.

* Update docs/source/en/optimization/torch2.0.md

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-05-15 19:11:24 +05:30
Dhruv Nair
4267d8f4eb [Single File] GGUF/Single File Support for HiDream (#11550)
* update

* update

* update

* update

* update

* update

* update
2025-05-15 12:25:18 +05:30
Aryan
06fee551e9 LTX Video 0.9.7 (#11516)
* add upsampling pipeline

* ltx upsample pipeline conversion; pipeline fixes

* make fix-copies

* remove print

* add vae convenience methods

* update

* add tests

* support denoising strength for upscaling & video-to-video

* update docs

* update doc checkpoints

* update docs

* fix

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-05-13 14:57:03 +05:30
Zhong-Yu Li
4f438de35a Add VisualCloze (#11377)
* VisualCloze

* style quality

* add docs

* add docs

* typo

* Update docs/source/en/api/pipelines/visualcloze.md

* delete einops

* style quality

* Update src/diffusers/pipelines/visualcloze/pipeline_visualcloze.py

* reorg

* refine doc

* style quality

* typo

* typo

* Update src/diffusers/image_processor.py

* add comment

* test

* style

* Modified based on review

* style

* restore image_processor

* update example url

* style

* fix-copies

* VisualClozeGenerationPipeline

* combine

* tests docs

* remove VisualClozeUpsamplingPipeline

* style

* quality

* test examples

* quality style

* typo

* make fix-copies

* fix test_callback_cfg and test_save_load_dduf in VisualClozePipelineFastTests

* add EXAMPLE_DOC_STRING to VisualClozeGenerationPipeline

* delete maybe_free_model_hooks from pipeline_visualcloze_combined

* Apply suggestions from code review

* fix test_save_load_local test; add reason for skipping cfg test

* more save_load test fixes

* fix tests in generation pipeline tests
2025-05-13 02:46:51 +05:30
Aryan
e48f6aeeb4 Hunyuan Video Framepack F1 (#11534)
* support framepack f1

* update docs

* update toctree

* remove typo
2025-05-12 16:11:10 +05:30
Sayak Paul
599c887164 feat: pipeline-level quantization config (#11130)
* feat: pipeline-level quant config.

Co-authored-by: SunMarc <marc.sun@hotmail.fr>

condition better.

support mapping.

improvements.

[Quantization] Add Quanto backend (#10756)

* update

* updaet

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/quantization/quanto.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update src/diffusers/quantizers/quanto/utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

[Single File] Add single file loading for SANA Transformer (#10947)

* added support for from_single_file

* added diffusers mapping script

* added testcase

* bug fix

* updated tests

* corrected code quality

* corrected code quality

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

[LoRA] Improve warning messages when LoRA loading becomes a no-op (#10187)

* updates

* updates

* updates

* updates

* notebooks revert

* fix-copies.

* seeing

* fix

* revert

* fixes

* fixes

* fixes

* remove print

* fix

* conflicts ii.

* updates

* fixes

* better filtering of prefix.

---------

Co-authored-by: hlky <hlky@hlky.ac>

[LoRA] CogView4 (#10981)

* update

* make fix-copies

* update

[Tests] improve quantization tests by additionally measuring the inference memory savings (#11021)

* memory usage tests

* fixes

* gguf

[`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing (#8998)

* Add initial template

* Second template

* feat: Add TextEmbeddingModule to AnyTextPipeline

* feat: Add AuxiliaryLatentModule template to AnyTextPipeline

* Add bert tokenizer from the anytext repo for now

* feat: Update AnyTextPipeline's modify_prompt method

This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe.

* Fill in the `forward` pass of `AuxiliaryLatentModule`

* `make style && make quality`

* `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library`

* Update error handling to raise and logging

* Add `create_glyph_lines` function into `TextEmbeddingModule`

* make style

* Up

* Up

* Up

* Up

* Remove several comments

* refactor: Remove ControlNetConditioningEmbedding and update code accordingly

* Up

* Up

* up

* refactor: Update AnyTextPipeline to include new optional parameters

* up

* feat: Add OCR model and its components

* chore: Update `TextEmbeddingModule` to include OCR model components and dependencies

* chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task

* `make style`

* refactor: Update `AnyTextPipeline`'s docstring

* Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once

* simplify

* `make style`

* Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function

* Simplify for now

* `make style`

* Up

* feat: Add scripts to convert AnyText controlnet to diffusers

* `make style`

* Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule`

* make style

* Up

* Simplify

* Up

* feat: Add safetensors module for loading model file

* Fix device issues

* Up

* Up

* refactor: Simplify

* refactor: Simplify code for loading models and handling data types

* `make style`

* refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule

* refactor: Update dtype in embedding_manager.py to match proj.weight

* Up

* Add attribution and adaptation information to pipeline_anytext.py

* Update usage example

* Will refactor `controlnet_cond_embedding` initialization

* Add `AnyTextControlNetConditioningEmbedding` template

* Refactor organization

* style

* style

* Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding`

* Follow one-file policy

* style

* [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel

* [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py

* [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py

* Refactor AnyTextControlNet to use configurable conditioning embedding channels

* Complete control net conditioning embedding in AnyTextControlNetModel

* up

* [FIX] Ensure embeddings use correct device in AnyTextControlNetModel

* up

* up

* style

* [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline

* [UPDATE] Update example code in anytext.py to use correct font file and improve clarity

* down

* [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing

* update pillow

* [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity

* [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file

* [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency

* 🆙

* style

* [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py

* style

* Update examples/research_projects/anytext/README.md

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Remove commented-out image preparation code in AnyTextPipeline

* Remove unnecessary blank line in README.md

[Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6  (#11018)

* update

* update

* update

* update

* update

* update

* update

* update

* update

fix: mixture tiling sdxl pipeline - adjust gerating time_ids & embeddings  (#11012)

small fix on generating time_ids & embeddings

[LoRA] support wan i2v loras from the world. (#11025)

* support wan i2v loras from the world.

* remove copied from.

* upates

* add lora.

Fix SD3 IPAdapter feature extractor (#11027)

chore: fix help messages in advanced diffusion examples (#10923)

Fix missing **kwargs in lora_pipeline.py (#11011)

* Update lora_pipeline.py

* Apply style fixes

* fix-copies

---------

Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Fix for multi-GPU WAN inference (#10997)

Ensure that hidden_state and shift/scale are on the same device when running with multiple GPUs

Co-authored-by: Jimmy <39@🇺🇸.com>

[Refactor] Clean up import utils boilerplate (#11026)

* update

* update

* update

Use `output_size` in `repeat_interleave` (#11030)

[hybrid inference 🍯🐝] Add VAE encode (#11017)

* [hybrid inference 🍯🐝] Add VAE encode

* _toctree: add vae encode

* Add endpoints, tests

* vae_encode docs

* vae encode benchmarks

* api reference

* changelog

* Update docs/source/en/hybrid_inference/overview.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Wan Pipeline scaling fix, type hint warning, multi generator fix (#11007)

* Wan Pipeline scaling fix, type hint warning, multi generator fix

* Apply suggestions from code review

[LoRA] change to warning from info when notifying the users about a LoRA no-op (#11044)

* move to warning.

* test related changes.

Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline (#10827)

* Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>

making ```formatted_images``` initialization compact (#10801)

compact writing

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

Fix aclnnRepeatInterleaveIntWithDim error on NPU for get_1d_rotary_pos_embed (#10820)

* get_1d_rotary_pos_embed support npu

* Update src/diffusers/models/embeddings.py

---------

Co-authored-by: Kai zheng <kaizheng@KaideMacBook-Pro.local>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

[Tests] restrict memory tests for quanto for certain schemes. (#11052)

* restrict memory tests for quanto for certain schemes.

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fixes

* style

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

[LoRA] feat: support non-diffusers wan t2v loras. (#11059)

feat: support non-diffusers wan t2v loras.

[examples/controlnet/train_controlnet_sd3.py] Fixes #11050 - Cast prompt_embeds and pooled_prompt_embeds to weight_dtype to prevent dtype mismatch (#11051)

Fix: dtype mismatch of prompt embeddings in sd3 controlnet training

Co-authored-by: Andreas Jörg <andreasjoerg@MacBook-Pro-von-Andreas-2.fritz.box>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

reverts accidental change that removes attn_mask in attn. Improves fl… (#11065)

reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop.

Co-authored-by: Juan Acevedo <jfacevedo@google.com>

Fix deterministic issue when getting pipeline dtype and device (#10696)

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

[Tests] add requires peft decorator. (#11037)

* add requires peft decorator.

* install peft conditionally.

* conditional deps.

Co-authored-by: DN6 <dhruv.nair@gmail.com>

---------

Co-authored-by: DN6 <dhruv.nair@gmail.com>

CogView4 Control Block (#10809)

* cogview4 control training

---------

Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>

[CI] pin transformers version for benchmarking. (#11067)

pin transformers version for benchmarking.

updates

Fix Wan I2V Quality (#11087)

* fix_wan_i2v_quality

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update pipeline_wan_i2v.py

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>

LTX 0.9.5 (#10968)

* update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>

make PR GPU tests conditioned on styling. (#11099)

Group offloading improvements (#11094)

update

Fix pipeline_flux_controlnet.py (#11095)

* Fix pipeline_flux_controlnet.py

* Fix style

update readme instructions. (#11096)

Co-authored-by: Juan Acevedo <jfacevedo@google.com>

Resolve stride mismatch in UNet's ResNet to support Torch DDP (#11098)

Modify UNet's ResNet implementation to resolve stride mismatch in Torch's DDP

Fix Group offloading behaviour when using streams (#11097)

* update

* update

Quality options in `export_to_video` (#11090)

* Quality options in `export_to_video`

* make style

improve more.

add placeholders for docstrings.

formatting.

smol fix.

solidify validation and annotation

* Revert "feat: pipeline-level quant config."

This reverts commit 316ff46b76.

* feat: implement pipeline-level quantization config

Co-authored-by: SunMarc <marc@huggingface.co>

* update

* fixes

* fix validation.

* add tests and other improvements.

* add tests

* import quality

* remove prints.

* add docs.

* fixes to docs.

* doc fixes.

* doc fixes.

* add validation to the input quantization_config.

* clarify recommendations.

* docs

* add to ci.

* todo.

---------

Co-authored-by: SunMarc <marc@huggingface.co>
2025-05-09 10:04:44 +05:30
Aryan
7b904941bc Cosmos (#10660)
* begin transformer conversion

* refactor

* refactor

* refactor

* refactor

* refactor

* refactor

* update

* add conversion script

* add pipeline

* make fix-copies

* remove einops

* update docs

* gradient checkpointing

* add transformer test

* update

* debug

* remove prints

* match sigmas

* add vae pt. 1

* finish CV* vae

* update

* update

* update

* update

* update

* update

* make fix-copies

* update

* make fix-copies

* fix

* update

* update

* make fix-copies

* update

* update tests

* handle device and dtype for safety checker; required in latest diffusers

* remove enable_gqa and use repeat_interleave instead

* enforce safety checker; use dummy checker in fast tests

* add review suggestion for ONNX export

Co-Authored-By: Asfiya Baig <asfiyab@nvidia.com>

* fix safety_checker issues when not passed explicitly

We could either do what's done in this commit, or update the Cosmos examples to explicitly pass the safety checker

* use cosmos guardrail package

* auto format docs

* update conversion script to support 14B models

* update name CosmosPipeline -> CosmosTextToWorldPipeline

* update docs

* fix docs

* fix group offload test failing for vae

---------

Co-authored-by: Asfiya Baig <asfiyab@nvidia.com>
2025-05-07 20:59:09 +05:30
Sayak Paul
fb29132b98 [docs] minor updates to bitsandbytes docs. (#11509)
* minor updates to bitsandbytes docs.

* Apply suggestions from code review
2025-05-06 18:52:18 +05:30
Aryan
d7ffe60166 Hunyuan Video Framepack (#11428)
* add transformer

* add pipeline

* fixes

* make fix-copies

* update

* add flux mu shift

* update example snippet

* debug

* cleanup

* batch_size=1 optimization

* add pipeline test

* fix for model cpu offloading'

* add last_image support; credits: https://github.com/lllyasviel/FramePack/pull/167

* update example with flf2v

* update penguin url

* fix test

* address review comment: https://github.com/huggingface/diffusers/pull/11428#discussion_r2071032371

* address review comment: https://github.com/huggingface/diffusers/pull/11428#discussion_r2071087689

* Update src/diffusers/pipelines/hunyuan_video/pipeline_hunyuan_video_framepack.py

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-05-06 14:59:38 +05:30
Aryan
1fa5639438 Fix torchao docs typo for fp8 granular quantization (#11473)
update
2025-05-06 07:54:28 +05:30
Steven Liu
e23705e557 [docs] Adapters (#11331)
* refactor adapter docs

* ip-adapter

* ip adapter

* fix toctree

* fix toctree

* lora

* images

* controlnet

* feedback

* controlnet

* t2i

* fix typo

* feedback

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-02 08:08:33 +05:30
Steven Liu
b848d479b1 [docs] Memory optims (#11385)
* reformat

* initial

* fin

* review

* inference

* feedback

* feedback

* feedback
2025-05-01 11:22:00 -07:00
co63oc
86294d3c7f Fix typos in docs and comments (#11416)
* Fix typos in docs and comments

* Apply style fixes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-30 20:30:53 -10:00
co63oc
f00a995753 Fix typos in strings and comments (#11407) 2025-04-24 08:53:47 -10:00
Emiliano
7986834572 Fix Flux IP adapter argument in the pipeline example (#11402)
Fix Flux IP adapter argument in the example

IP-Adapter example had a wrong argument. Fix `true_cfg` -> `true_cfg_scale`
2025-04-24 08:41:12 -10:00
Linoy Tsaban
e30d3bf544 [LoRA] add LoRA support to HiDream and fine-tuning script (#11281)
* initial commit

* initial commit

* initial commit

* initial commit

* initial commit

* initial commit

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>

* move prompt embeds, pooled embeds outside

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: hlky <hlky@hlky.ac>

* fix import

* fix import and tokenizer 4, text encoder 4 loading

* te

* prompt embeds

* fix naming

* shapes

* initial commit to add HiDreamImageLoraLoaderMixin

* fix init

* add tests

* loader

* fix model input

* add code example to readme

* fix default max length of text encoders

* prints

* nullify training cond in unpatchify for temp fix to incompatible shaping of transformer output during training

* smol fix

* unpatchify

* unpatchify

* fix validation

* flip pred and loss

* fix shift!!!

* revert unpatchify changes (for now)

* smol fix

* Apply style fixes

* workaround moe training

* workaround moe training

* remove prints

* to reduce some memory, keep vae in `weight_dtype` same as we have for flux (as it's the same vae)
bbd0c161b5/examples/dreambooth/train_dreambooth_lora_flux.py (L1207)

* refactor to align with HiDream refactor

* refactor to align with HiDream refactor

* refactor to align with HiDream refactor

* add support for cpu offloading of text encoders

* Apply style fixes

* adjust lr and rank for train example

* fix copies

* Apply style fixes

* update README

* update README

* update README

* fix license

* keep prompt2,3,4 as None in validation

* remove reverse ode comment

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* vae offload change

* fix text encoder offloading

* Apply style fixes

* cleaner to_kwargs

* fix module name in copied from

* add requirements

* fix offloading

* fix offloading

* fix offloading

* update transformers version in reqs

* try AutoTokenizer

* try AutoTokenizer

* Apply style fixes

* empty commit

* Delete tests/lora/test_lora_layers_hidream.py

* change tokenizer_4 to load with AutoTokenizer as well

* make text_encoder_four and tokenizer_four configurable

* save model card

* save model card

* revert T5

* fix test

* remove non diffusers lumina2 conversion

---------

Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-22 11:44:02 +03:00
YiYi Xu
0021bfa1e1 support Wan-FLF2V (#11353)
* update transformer

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2025-04-18 10:27:50 -10:00
Sayak Paul
b00a564dac [docs] add note about use_duck_shape in auraflow docs. (#11348)
add note about use_duck_shape in auraflow docs.
2025-04-17 10:25:39 +05:30
Sayak Paul
efc9d68b15 [chore] fix lora docs utils (#11338)
fix lora docs utils
2025-04-17 09:25:53 +05:30
Ishan Modi
d63e6fccb1 [BUG] fixed _toctree.yml alphabetical ordering (#11277)
update
2025-04-16 09:04:22 -07:00
Sayak Paul
ce1063acfa [docs] add a snippet for compilation in the auraflow docs. (#11327)
* add a snippet for compilation in the auraflow docs.

* include speedups.
2025-04-16 11:12:09 +05:30
Hameer Abbasi
9352a5ca56 [LoRA] Add LoRA support to AuraFlow (#10216)
* Add AuraFlowLoraLoaderMixin

* Add comments, remove qkv fusion

* Add Tests

* Add AuraFlowLoraLoaderMixin to documentation

* Add Suggested changes

* Change attention_kwargs->joint_attention_kwargs

* Rebasing derp.

* fix

* fix

* Quality fixes.

* make style

* `make fix-copies`

* `ruff check --fix`

* Attept 1 to fix tests.

* Attept 2 to fix tests.

* Attept 3 to fix tests.

* Address review comments.

* Rebasing derp.

* Get more tests passing by copying from Flux. Address review comments.

* `joint_attention_kwargs`->`attention_kwargs`

* Add `lora_scale` property for te LoRAs.

* Make test better.

* Remove useless property.

* Skip TE-only tests for AuraFlow.

* Support LoRA for non-CLIP TEs.

* Restore LoRA tests.

* Undo adding LoRA support for non-CLIP TEs.

* Undo support for TE in AuraFlow LoRA.

* `make fix-copies`

* Sync with upstream changes.

* Remove unneeded stuff.

* Mirror `Lumina2`.

* Skip for MPS.

* Address review comments.

* Remove duplicated code.

* Remove unnecessary code.

* Remove repeated docs.

* Propagate attention.

* Fix TE target modules.

* MPS fix for LoRA tests.

* Unrelated TE LoRA tests fix.

* Fix AuraFlow LoRA tests by applying to the right denoiser layers.

Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com>

* Apply style fixes

* empty commit

* Fix the repo consistency issues.

* Remove unrelated changes.

* Style.

* Fix `test_lora_fuse_nan`.

* fix quality issues.

* `pytest.xfail` -> `ValueError`.

* Add back `skip_mps`.

* Apply style fixes

* `make fix-copies`

---------

Co-authored-by: Warlord-K <warlordk28@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-15 10:41:28 +05:30
Sayak Paul
cefa28f449 [docs] Promote AutoModel usage (#11300)
* docs: promote the usage of automodel.

* bitsandbytes

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-15 09:25:40 +05:30
Beinsezii
8819cda6c0 Add skrample section to community_projects.md (#11319)
Update community_projects.md

https://github.com/huggingface/diffusers/discussions/11158#discussioncomment-12681691
2025-04-14 12:12:59 -10:00
Ishan Modi
f1f38ffbee [ControlNet] Adds controlnet for SanaTransformer (#11040)
* added controlnet for sana transformer

* improve code quality

* addressed PR comments

* bug fixes

* added test cases

* update

* added dummy objects

* addressed PR comments

* update

* Forcing update

* add to docs

* code quality

* addressed PR comments

* addressed PR comments

* update

* addressed PR comments

* added proper styling

* update

* Revert "added proper styling"

This reverts commit 344ee8a701.

* manually ordered

* Apply suggestions from code review

---------

Co-authored-by: Aryan <contact.aryanvs@gmail.com>
2025-04-13 19:19:39 +05:30
Adrien B
ed41db8525 Update autoencoderkl_allegro.md (#11303)
Correction typo
2025-04-13 09:41:30 +05:30
hlky
0ef29355c9 HiDream Image (#11231)
* HiDream Image


---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-04-11 06:31:34 -10:00
hlky
552cd32058 [docs] AutoModel (#11250)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-09 16:42:23 +05:30
Sayak Paul
f685981ed0 [docs] minor updates to dtype map docs. (#11237)
minor updates to dtype map docs.
2025-04-09 08:38:17 +05:30
Sayak Paul
b924251dd8 minor update to sana sprint docs. (#11236) 2025-04-09 08:17:45 +05:30
Sayak Paul
4b27c4a494 [feat] implement record_stream when using CUDA streams during group offloading (#11081)
* implement record_stream for better performance.

* fix

* style.

* merge #11097

* Update src/diffusers/hooks/group_offloading.py

Co-authored-by: Aryan <aryan@huggingface.co>

* fixes

* docstring.

* remaining todos in low_cpu_mem_usage

* tests

* updates to docs.

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2025-04-08 21:17:49 +05:30