sayakpaul
50dea89dc6
Release: v0.34.0
v0.34.0
2025-06-24 20:22:13 +05:30
Sayak Paul
d3e27e05f0
guard omnigen processor. ( #11799 )
2025-06-24 19:15:34 +05:30
Aryan
5df02fc171
[tests] Fix group offloading and layerwise casting test interaction ( #11796 )
...
* update
* update
* update
2025-06-24 17:33:32 +05:30
Sayak Paul
7392c8ff5a
[chore] raise as early as possible in group offloading ( #11792 )
...
* raise as early as possible in group offloading
* remove check from ModuleGroup
2025-06-24 15:05:23 +05:30
Aryan
474a248f10
[tests] Fix HunyuanVideo Framepack device tests ( #11789 )
...
update
2025-06-24 13:49:37 +05:30
YiYi Xu
7bc0a07b19
[lora] only remove hooks that we add back ( #11768 )
...
up
2025-06-23 16:49:19 -10:00
Sayak Paul
92542719ed
[docs] minor cleanups in the lora docs. ( #11770 )
...
* minor cleanups in the lora docs.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* format docs
* fix copies
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-24 08:10:07 +05:30
imbr92
6760300202
Add --lora_alpha and metadata handling to train_dreambooth_lora_sana.py ( #11744 )
...
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com >
2025-06-23 15:46:44 +03:00
Yuanchen Guo
798265f2b6
[Wan] Fix mask padding in Wan VACE pipeline. ( #11778 )
2025-06-23 16:28:21 +05:30
Dhruv Nair
cd813499be
[CI] Skip ONNX Upscale tests ( #11774 )
...
update
2025-06-23 12:14:01 +05:30
Sayak Paul
fbddf02807
[tests] properly skip tests instead of return ( #11771 )
...
model test updates
2025-06-23 11:59:59 +05:30
Yao Matrix
f20b83a04f
enable cpu offloading of new pipelines on XPU & use device agnostic empty to make pipelines work on XPU ( #11671 )
...
* commit 1
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* patch 2
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* Update pipeline_pag_sana.py
* Update pipeline_sana.py
* Update pipeline_sana_controlnet.py
* Update pipeline_sana_sprint_img2img.py
* Update pipeline_sana_sprint.py
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* fix fat-thumb while merge conflict
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* fix ci issues
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com >
2025-06-23 09:44:16 +05:30
jiqing-feng
ee40088fe5
enable deterministic in bnb 4 bit tests ( #11738 )
...
* enable deterministic in bnb 4 bit tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix 8bit test
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
2025-06-23 08:17:36 +05:30
Tolga Cangöz
7fc53b5d66
Fix dimensionalities in apply_rotary_emb functions' comments ( #11717 )
...
Fix dimensionality in `apply_rotary_emb` functions' comments.
2025-06-21 12:09:28 -10:00
Steven Liu
0874dd04dc
[docs] LoRA scale scheduling ( #11727 )
...
draft
2025-06-20 10:15:29 -07:00
Steven Liu
6184d8a433
[docs] device_map ( #11711 )
...
draft
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2025-06-20 10:14:48 -07:00
Steven Liu
5a6e386464
[docs] Quantization + torch.compile + offloading ( #11703 )
...
* draft
* feedback
* update
* feedback
* fix
* feedback
* feedback
* fix
* feedback
2025-06-20 10:11:39 -07:00
Dhruv Nair
42077e6c73
Fix failing cpu offload test for LTX Latent Upscale ( #11755 )
...
update
2025-06-20 06:07:34 +02:00
Sayak Paul
3d8d8485fc
fix invalid component handling behaviour in PipelineQuantizationConfig ( #11750 )
...
* start
* updates
2025-06-20 07:54:12 +05:30
Dhruv Nair
195926bbdc
Update Chroma Docs ( #11753 )
...
* update
* update
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2025-06-19 19:33:19 +02:00
Sayak Paul
85a916bb8b
make group offloading work with disk/nvme transfers ( #11682 )
...
* start implementing disk offloading in group.
* delete diff file.
* updates.patch
* offload_to_disk_path
* check if safetensors already exist.
* add test and clarify.
* updates
* update todos.
* update more docs.
* update docs
2025-06-19 18:09:30 +05:30
Dhruv Nair
3287ce2890
Fix HiDream pipeline test module ( #11754 )
...
update
2025-06-19 17:06:14 +05:30
Dhruv Nair
0c11c8c1ac
[CI] Fix SANA tests ( #11756 )
...
update
2025-06-19 17:06:02 +05:30
Dhruv Nair
fc51583c8a
[CI] Fix WAN VACE tests ( #11757 )
...
update
2025-06-19 17:03:12 +05:30
Sayak Paul
fb57c76aa1
[LoRA] refactor lora loading at the model-level ( #11719 )
...
* factor out stuff from load_lora_adapter().
* simplifying text encoder lora loading.
* fix peft.py
* fix logging locations.
* formatting
* fix
* update
* update
* update
2025-06-19 13:06:25 +05:30
dependabot[bot]
7251bb4fd0
Bump urllib3 from 2.2.3 to 2.5.0 in /examples/server ( #11748 )
...
Bumps [urllib3](https://github.com/urllib3/urllib3 ) from 2.2.3 to 2.5.0.
- [Release notes](https://github.com/urllib3/urllib3/releases )
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst )
- [Commits](https://github.com/urllib3/urllib3/compare/2.2.3...2.5.0 )
---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.5.0
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-19 11:09:33 +05:30
Aryan
3fba74e153
Add missing HiDream license ( #11747 )
...
update
2025-06-19 08:07:47 +05:30
Aryan
a4df8dbc40
Update more licenses to 2025 ( #11746 )
...
update
2025-06-19 07:46:01 +05:30
Sayak Paul
48eae6f420
[Quantizers] add is_compileable property to quantizers. ( #11736 )
...
add is_compileable property to quantizers.
2025-06-19 07:45:06 +05:30
Dhruv Nair
66394bf6c7
Chroma Follow Up ( #11725 )
...
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
2025-06-18 22:24:41 +05:30
Sayak Paul
62cce3045d
[chore] change to 2025 licensing for remaining ( #11741 )
...
change to 2025 licensing for remaining
2025-06-18 20:56:00 +05:30
Sayak Paul
05e867784d
[tests] device_map tests for all models. ( #11708 )
...
* device_map tests for all models.
* updates
* Update tests/models/test_modeling_common.py
Co-authored-by: Aryan <aryan@huggingface.co >
* fix device_map in test
---------
Co-authored-by: Aryan <aryan@huggingface.co >
2025-06-18 10:52:06 +05:30
Leo Jiang
d72184eba3
[training] add ds support to lora hidream ( #11737 )
...
* [training] add ds support to lora hidream
* Apply style fixes
---------
Co-authored-by: J石页 <jiangshuo9@h-partners.com >
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-18 09:26:02 +05:30
Saurabh Misra
5ce4814af1
⚡️ Speed up method AutoencoderKLWan.clear_cache by 886% ( #11665 )
...
* ⚡️ Speed up method `AutoencoderKLWan.clear_cache` by 886%
**Key optimizations:**
- Compute the number of `WanCausalConv3d` modules in each model (`encoder`/`decoder`) **only once during initialization** and store it in `self._cached_conv_counts`. This removes the repeated tree traversals at every `clear_cache` call, which profiling identified as the main bottleneck.
- The internal helper `_count_conv3d_fast` is optimized via a generator expression with `sum` for efficiency.
All comments from the original code are preserved, except for updated or removed local docstrings/comments relevant to changed lines.
**Function signatures and outputs remain unchanged.**
* Apply style fixes
* Apply suggestions from code review
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
* Apply style fixes
---------
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co >
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
Co-authored-by: Aseem Saxena <aseem.bits@gmail.com >
2025-06-18 08:46:03 +05:30
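The caching idea described in the commit above (count the causal-conv modules once at initialization, then reuse the counts in `clear_cache`) can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual diffusers code: `Module`, `CausalConv3d`, `count_conv3d`, `Autoencoder`, and the `_enc_feat_map`/`_dec_feat_map` attributes are hypothetical stand-ins so the example runs without torch. Only the name `_cached_conv_counts` and the generator-expression-with-`sum` approach come from the commit message.

```python
class Module:
    """Tiny stand-in for a module tree (hypothetical, not torch.nn.Module)."""

    def __init__(self, *children):
        self.children = list(children)

    def modules(self):
        # Yield this module and all of its descendants, depth first.
        yield self
        for child in self.children:
            yield from child.modules()


class CausalConv3d(Module):
    """Stand-in for the layer type whose instances each need a cache slot."""


def count_conv3d(model):
    # Generator expression with sum: one pass over the tree, no temporary list.
    return sum(isinstance(m, CausalConv3d) for m in model.modules())


class Autoencoder:
    def __init__(self, encoder, decoder):
        self.encoder = encoder
        self.decoder = decoder
        # Walk each tree once here instead of on every clear_cache() call.
        self._cached_conv_counts = {
            "encoder": count_conv3d(encoder),
            "decoder": count_conv3d(decoder),
        }
        self.clear_cache()

    def clear_cache(self):
        # Reset the per-layer caches using the precomputed counts.
        self._enc_feat_map = [None] * self._cached_conv_counts["encoder"]
        self._dec_feat_map = [None] * self._cached_conv_counts["decoder"]


encoder = Module(CausalConv3d(), Module(CausalConv3d(), CausalConv3d()))
decoder = Module(CausalConv3d(), Module(CausalConv3d()))
vae = Autoencoder(encoder, decoder)
```

The trade-off in this sketch is that the counts go stale if the module tree changes after construction, so it only fits models whose structure is fixed once built.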
Linoy Tsaban
1bc6f3dc0f
[LoRA training] update metadata use for lora alpha + README ( #11723 )
...
* lora alpha
* Apply style fixes
* Update examples/advanced_diffusion_training/README_flux.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
* fix readme format
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2025-06-17 12:19:27 +03:00
Aryan
79bd7ecc78
Support more Wan loras (VACE) ( #11726 )
...
update
2025-06-17 10:39:18 +05:30
David Berenstein
9b834f8710
Add Pruna optimization framework documentation ( #11688 )
...
* Add Pruna optimization framework documentation
- Introduced a new section for Pruna in the table of contents.
- Added comprehensive documentation for Pruna, detailing its optimization techniques, installation instructions, and examples for optimizing and evaluating models
* Enhance Pruna documentation with image alt text and code block formatting
- Added alt text to images for better accessibility and context.
- Changed code block syntax from diff to python for improved clarity.
* Add installation section to Pruna documentation
- Introduced a new installation section in the Pruna documentation to guide users on how to install the framework.
- Enhanced the overall clarity and usability of the documentation for new users.
* Update pruna.md
* Update pruna.md
* Update Pruna documentation for model optimization and evaluation
- Changed section titles for consistency and clarity, from "Optimizing models" to "Optimize models" and "Evaluating and benchmarking optimized models" to "Evaluate and benchmark models".
- Enhanced descriptions to clarify the use of `diffusers` models and the evaluation process.
- Added a new example for evaluating standalone `diffusers` models.
- Updated references and links for better navigation within the documentation.
* Refactor Pruna documentation for clarity and consistency
- Removed outdated references to FLUX-juiced and streamlined the explanation of benchmarking.
- Enhanced the description of evaluating standalone `diffusers` models.
- Cleaned up code examples by removing unnecessary imports and comments for better readability.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Enhance Pruna documentation with new examples and clarifications
- Added an image to illustrate the optimization process.
- Updated the explanation for sharing and loading optimized models on the Hugging Face Hub.
- Clarified the evaluation process for optimized models using the EvaluationAgent.
- Improved descriptions for defining metrics and evaluating standalone diffusers models.
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-16 12:25:05 -07:00
Carl Thomé
81426b0f19
Fix misleading comment ( #11722 )
2025-06-16 08:47:00 -10:00
Sayak Paul
f0dba33d82
[training] show how metadata stuff should be incorporated in training scripts. ( #11707 )
...
* show how metadata stuff should be incorporated in training scripts.
* typing
* fix
---------
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com >
2025-06-16 16:42:34 +05:30
Sayak Paul
d1db4f853a
[LoRA] fix flux lora loader when return_metadata is true for non-diffusers ( #11716 )
...
* fix flux lora loader when return_metadata is true for non-diffusers
* remove annotation
2025-06-16 14:26:35 +05:30
Edna
8adc6003ba
Chroma Pipeline ( #11698 )
...
* working state from hameerabbasi and iddl
* working state from hameerabbasi and iddl (transformer)
* working state (normalization)
* working state (embeddings)
* add chroma loader
* add chroma to mappings
* add chroma to transformer init
* take out variant stuff
* get decently far in changing variant stuff
* add chroma init
* make chroma output class
* add chroma transformer to dummy tp
* add chroma to init
* add chroma to init
* fix single file
* update
* update
* add chroma to auto pipeline
* add chroma to pipeline init
* change to chroma transformer
* take out variant from blocks
* swap embedder location
* remove prompt_2
* work on swapping text encoders
* remove mask function
* dont modify mask (for now)
* wrap attn mask
* no attn mask (can't get it to work)
* remove pooled prompt embeds
* change to my own unpooled embedder
* fix load
* take pooled projections out of transformer
* ensure correct dtype for chroma embeddings
* update
* use dn6 attn mask + fix true_cfg_scale
* use chroma pipeline output
* use DN6 embeddings
* remove guidance
* remove guidance embed (pipeline)
* remove guidance from embeddings
* don't return length
* dont change dtype
* remove unused stuff, fix up docs
* add chroma autodoc
* add .md (oops)
* initial chroma docs
* undo don't change dtype
* undo arxiv change
unsure why that happened
* fix hf papers regression in more places
* Update docs/source/en/api/pipelines/chroma.md
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
* do_cfg -> self.do_classifier_free_guidance
* Update docs/source/en/api/models/chroma_transformer.md
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
* Update chroma.md
* Move chroma layers into transformer
* Remove pruned AdaLayerNorms
* Add chroma fast tests
* (untested) batch cond and uncond
* Add # Copied from for shift
* Update # Copied from statements
* update norm imports
* Revert cond + uncond batching
* Add transformer tests
* move chroma test (oops)
* chroma init
* fix chroma pipeline fast tests
* Update src/diffusers/models/transformers/transformer_chroma.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
* Move Approximator and Embeddings
* Fix auto pipeline + make style, quality
* make style
* Apply style fixes
* switch to new input ids
* fix # Copied from error
* remove # Copied from on protected members
* try to fix import
* fix import
* make fix-copies
* revert style fix
* update chroma transformer params
* update chroma transformer approximator init params
* update to pad tokens
* fix batch inference
* Make more pipeline tests work
* Make most transformer tests work
* fix docs
* make style, make quality
* skip batch tests
* fix test skipping
* fix test skipping again
* fix for tests
* Fix all pipeline test
* update
* push local changes, fix docs
* add encoder test, remove pooled dim
* default proj dim
* fix tests
* fix equal size list input
* update
* push local changes, fix docs
* add encoder test, remove pooled dim
* default proj dim
* fix tests
* fix equal size list input
* Revert "fix equal size list input"
This reverts commit 3fe4ad67d5 .
* update
* update
* update
* update
* update
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-14 06:52:56 +05:30
Aryan
9f91305f85
Cosmos Predict2 ( #11695 )
...
* support text-to-image
* update example
* make fix-copies
* support use_flow_sigmas in EDM scheduler instead of maintain cosmos-specific scheduler
* support video-to-world
* update
* rename text2image pipeline
* make fix-copies
* add t2i test
* add test for v2w pipeline
* support edm dpmsolver multistep
* update
* update
* update
* update tests
* fix tests
* safety checker
* make conversion script work without guardrail
2025-06-14 01:51:29 +05:30
Sayak Paul
368958df6f
[LoRA] parse metadata from LoRA and save metadata ( #11324 )
...
* feat: parse metadata from lora state dicts.
* tests
* fix tests
* key renaming
* fix
* smol update
* smol updates
* load metadata.
* automatically save metadata in save_lora_adapter.
* propagate changes.
* changes
* add test to models too.
* tighter tests.
* updates
* fixes
* rename tests.
* sorted.
* Update src/diffusers/loaders/lora_base.py
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
* review suggestions.
* removeprefix.
* propagate changes.
* fix-copies
* sd
* docs.
* fixes
* get review ready.
* one more test to catch error.
* change to a different approach.
* fix-copies.
* todo
* sd3
* update
* revert changes in get_peft_kwargs.
* update
* fixes
* fixes
* simplify _load_sft_state_dict_metadata
* update
* style fix
* update
* update
* update
* empty commit
* _pack_dict_with_prefix
* update
* TODO 1.
* todo: 2.
* todo: 3.
* update
* update
* Apply suggestions from code review
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
* reraise.
* move argument.
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com >
2025-06-13 14:37:49 +05:30
Aryan
e52ceae375
Support Wan AccVideo lora ( #11704 )
...
* update
* make style
* Update src/diffusers/loaders/lora_conversion_utils.py
* add note explaining threshold
2025-06-13 11:55:08 +05:30
Sayak Paul
62cbde8d41
[docs] mention fp8 benefits on supported hardware. ( #11699 )
...
* mention fp8 benefits on supported hardware.
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-13 07:17:03 +05:30
Sayak Paul
648e8955cf
swap out token for style bot. ( #11701 )
2025-06-13 06:51:19 +05:30
Sayak Paul
00b179fb1a
[docs] add compilation bits to the bitsandbytes docs. ( #11693 )
...
* add compilation bits to the bitsandbytes docs.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* finish
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-12 08:49:24 +05:30
Tolga Cangöz
47ef79464f
Apply Occam's Razor in position embedding calculation ( #11562 )
...
* fix: remove redundant indexing
* style
2025-06-11 13:47:37 -10:00
Joel Schlosser
b272807bc8
Avoid DtoH sync from access of nonzero() item in scheduler ( #11696 )
2025-06-11 12:03:40 -10:00
rasmi
447ccd0679
Set _torch_version to N/A if torch is disabled. ( #11645 )
2025-06-11 11:59:54 -10:00