* Add `WanAnimateTransformer3DModel` to `SINGLE_FILE_LOADABLE_CLASSES`
* Fixed dtype mismatch when loading a single file
* Fixed a bug that resulted in white noise output during generation
* Update dtype check for time embedder (the previous check caused white noise output)
* Improve code readability
* Optimize dtype handling
Removed unnecessary dtype conversions for timestep and weight.
* Apply style fixes
* Refactor time embedding dtype handling
Adjust time embedding type conversion for compatibility.
* Apply style fixes
* Modify comment for WanTimeTextImageEmbedding class
---------
Co-authored-by: Sam Edwards <sam.edwards1976@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* up
* up up
* update outputs
* style
* add modular_auto_docstring!
* more auto docstring
* style
* up up up
* more more
* up
* address feedback
* add TODO in the description for empty docstring
* refactor based on dhruv's feedback: remove the class method
* add template method
* up
* up up up
* apply auto docstring
* make style
* remove space in make docstring
* Apply suggestions from code review
* revert change in z
* fix
* Apply style fixes
* include auto-docstring check in the modular ci. (#13004)
* Run ruff format after auto docstring generation
* up
* upup
* upup
* style
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Feature: Add BriaFiboEditPipeline to diffusers
* Introduced BriaFiboEditPipeline class with necessary backend requirements.
* Updated import structures in relevant modules to include BriaFiboEditPipeline.
* Ensured compatibility with existing pipelines and type checking.
* Feature: Introduce Bria Fibo Edit Pipeline
* Added BriaFiboEditPipeline class for structured JSON-native image editing.
* Created documentation for the new pipeline in bria_fibo_edit.md.
* Updated import structures to include the new pipeline and its components.
* Added unit tests for the BriaFiboEditPipeline to ensure functionality and correctness.
* Enhancement: Update Bria Fibo Edit Pipeline and Documentation
* Refined the Bria Fibo Edit model description for clarity and detail.
* Added usage instructions for model authentication and login.
* Implemented mask handling functions in the BriaFiboEditPipeline for improved image editing capabilities.
* Updated unit tests to cover new mask functionalities and ensure input validation.
* Adjusted example code in documentation to reflect changes in the pipeline's usage.
* Update Bria Fibo Edit documentation with corrected Hugging Face page link
* add dreambooth training script
* style and quality
* Delete temp.py
* Enhancement: Improve JSON caption validation in DreamBoothDataset
* Updated the clean_json_caption function to handle both string and dictionary inputs for captions.
* Added error handling to raise a ValueError for invalid caption types, ensuring better input validation.
* Add datasets dependency to requirements_fibo_edit.txt
* Add bria_fibo_edit to docs table of contents
* Fix dummy objects ordering
* Fix BriaFiboEditPipeline to use passed generator parameter
The pipeline was ignoring the generator parameter and only using
the seed parameter. This caused non-deterministic outputs in tests
that pass a seeded generator.
* Remove fibo_edit training script and related files
---------
Co-authored-by: kfirbria <kfir@bria.ai>
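
The generator fix described above can be illustrated with a minimal sketch. This is not the pipeline's actual code: `resolve_rng` is a hypothetical helper, `random.Random` stands in for `torch.Generator`, and the rule "a passed generator takes precedence over the seed" is an assumption about the intended behavior.

```python
import random
from typing import Optional


def resolve_rng(generator: Optional[random.Random] = None, seed: Optional[int] = None) -> random.Random:
    """Return the RNG to sample from, preferring a caller-supplied generator.

    Hypothetical helper: random.Random stands in for torch.Generator.
    """
    if generator is not None:
        # Honor the generator passed by the caller; ignoring it breaks
        # reproducibility for callers that seed their own generator.
        return generator
    # Fall back to the seed only when no generator was given.
    return random.Random(seed)


# Two calls with identically seeded generators draw the same value,
# even though the seed arguments differ.
first = resolve_rng(generator=random.Random(0), seed=123).random()
second = resolve_rng(generator=random.Random(0), seed=999).random()
```

With this precedence, a test that passes a seeded generator gets the same draw on every run regardless of the seed argument.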
* gracefully error out when attn-backend x cp combo isn't supported.
* Revert "gracefully error out when attn-backend x cp combo isn't supported."
This reverts commit c8abb5d7c0.
* gracefully error out when attn-backend x cp combo isn't supported.
* up
* address PR feedback.
* up
* Update src/diffusers/models/modeling_utils.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* dot.
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* LTX 2 transformer single file support
* LTX 2 video VAE single file support
* LTX 2 audio VAE single file support
* Make it easier to distinguish LTX 1 and 2 models
* Improve incorrect LoRA format error message
* Add flag in PeftLoraLoaderMixinTests to disable text encoder LoRA tests
* Apply changes to LTX2LoraTests
* Further improve incorrect LoRA format error msg following review
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Fix QwenImage txt_seq_lens handling
* formatting
* formatting
* remove txt_seq_lens and use bool mask
* use compute_text_seq_len_from_mask
* add seq_lens to dispatch_attention_fn
* use joint_seq_lens
* remove unused index_block
* WIP: Remove seq_lens parameter and use mask-based approach
- Remove seq_lens parameter from dispatch_attention_fn
- Update varlen backends to extract seqlens from masks
- Update QwenImage to pass 2D joint_attention_mask
- Fix native backend to handle 2D boolean masks
- Fix sage_varlen seqlens_q to match seqlens_k for self-attention
Note: sage_varlen still produces black images; needs further investigation
* fix formatting
* undo sage changes
* xformers support
* hub fix
* fix torch compile issues
* fix tests
* use _prepare_attn_mask_native
* proper deprecation notice
* add deprecate to txt_seq_lens
* Update src/diffusers/models/transformers/transformer_qwenimage.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>
* Update src/diffusers/models/transformers/transformer_qwenimage.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>
* Only create the mask if there's actual padding
* fix order of docstrings
* Adds performance benchmarks and optimization details for QwenImage
Enhances documentation with comprehensive performance insights for the QwenImage pipeline.
* rope_text_seq_len = text_seq_len
* rename to max_txt_seq_len
* removed deprecated args
* undo unrelated change
* Updates QwenImage performance documentation
Removes detailed attention backend benchmarks and simplifies the torch.compile performance description.
Focuses on the key performance improvement with torch.compile, highlighting the specific speedup from 4.70s to 1.93s on an A100 GPU.
Streamlines the documentation to provide more concise and actionable performance insights.
* Updates deprecation warnings for txt_seq_lens parameter
Extends the deprecation timeline for txt_seq_lens from version 0.37.0 to 0.39.0 across multiple Qwen image-related models.
Adds a new unit test to verify the deprecation warning behavior for the txt_seq_lens parameter.
* fix compile
* formatting
* fix compile tests
* rename helper
* remove duplicate
* smaller values
* removed
* use torch.cond for torch compile
* Construct joint attention mask once
* test different backends
* construct joint attention mask once to avoid reconstructing in every block
* Update src/diffusers/models/attention_dispatch.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>
* formatting
* raising an error from the EditPlus pipeline when batch_size > 1
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: cdutr <dutra_carlos@hotmail.com>
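
The mask-based approach in the QwenImage commits above (derive sequence lengths from a boolean mask, and only keep a mask when there is actual padding) can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: `seq_lens_from_mask` is a hypothetical name, nested lists stand in for tensors, and counting `True` entries gives the sequence length only for right-padded masks.

```python
from typing import List, Optional, Tuple

Mask = List[List[bool]]


def seq_lens_from_mask(mask: Mask) -> Tuple[List[int], Optional[Mask]]:
    """Derive per-sample valid lengths from a 2D boolean mask (True = real token).

    Returns the lengths, plus the mask itself only if some row is padded;
    otherwise None, so downstream attention can skip masking entirely.
    """
    # Count valid tokens per row (assumes right-padded rows).
    lengths = [sum(row) for row in mask]
    # A row is padded when it has fewer valid tokens than positions.
    has_padding = any(n < len(row) for n, row in zip(lengths, mask))
    return lengths, (mask if has_padding else None)


padded = [[True, True, False], [True, True, True]]
lens_padded, mask_padded = seq_lens_from_mask(padded)

dense = [[True, True], [True, True]]
lens_dense, mask_dense = seq_lens_from_mask(dense)
```

Returning `None` for an unpadded batch mirrors the "only create the mask if there's actual padding" commit: the unmasked path is typically cheaper for attention backends.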