mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
Commit Graph

5220 Commits

Author SHA1 Message Date
Sayak Paul
e4b056fe65 [LoRA] support wan i2v loras from the world. (#11025)
* support wan i2v loras from the world.

* remove copied from.

* updates

* add lora.
2025-03-11 20:43:29 +05:30
Eliseu Silva
4e3ddd5afa fix: mixture tiling sdxl pipeline - adjust generating time_ids & embeddings (#11012)
small fix on generating time_ids & embeddings
2025-03-11 04:20:18 -03:00
Dhruv Nair
9add071592 [Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6 (#11018)
* update

* update

* update

* update

* update

* update

* update

* update

* update
2025-03-11 10:52:01 +05:30
Tolga Cangöz
b88fef4785 [Research Project] Add AnyText: Multilingual Visual Text Generation And Editing (#8998)
* Add initial template

* Second template

* feat: Add TextEmbeddingModule to AnyTextPipeline

* feat: Add AuxiliaryLatentModule template to AnyTextPipeline

* Add bert tokenizer from the anytext repo for now

* feat: Update AnyTextPipeline's modify_prompt method

This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe.

* Fill in the `forward` pass of `AuxiliaryLatentModule`

* `make style && make quality`

* `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library`

* Update error handling to raise and logging

* Add `create_glyph_lines` function into `TextEmbeddingModule`

* make style

* Up

* Up

* Up

* Up

* Remove several comments

* refactor: Remove ControlNetConditioningEmbedding and update code accordingly

* Up

* Up

* up

* refactor: Update AnyTextPipeline to include new optional parameters

* up

* feat: Add OCR model and its components

* chore: Update `TextEmbeddingModule` to include OCR model components and dependencies

* chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task

* `make style`

* refactor: Update `AnyTextPipeline`'s docstring

* Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once

* simplify

* `make style`

* Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function

* Simplify for now

* `make style`

* Up

* feat: Add scripts to convert AnyText controlnet to diffusers

* `make style`

* Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule`

* make style

* Up

* Simplify

* Up

* feat: Add safetensors module for loading model file

* Fix device issues

* Up

* Up

* refactor: Simplify

* refactor: Simplify code for loading models and handling data types

* `make style`

* refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule

* refactor: Update dtype in embedding_manager.py to match proj.weight

* Up

* Add attribution and adaptation information to pipeline_anytext.py

* Update usage example

* Will refactor `controlnet_cond_embedding` initialization

* Add `AnyTextControlNetConditioningEmbedding` template

* Refactor organization

* style

* style

* Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding`

* Follow one-file policy

* style

* [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel

* [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py

* [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py

* Refactor AnyTextControlNet to use configurable conditioning embedding channels

* Complete control net conditioning embedding in AnyTextControlNetModel

* up

* [FIX] Ensure embeddings use correct device in AnyTextControlNetModel

* up

* up

* style

* [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline

* [UPDATE] Update example code in anytext.py to use correct font file and improve clarity

* down

* [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing

* update pillow

* [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity

* [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file

* [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency

* 🆙

* style

* [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py

* style

* Update examples/research_projects/anytext/README.md

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Remove commented-out image preparation code in AnyTextPipeline

* Remove unnecessary blank line in README.md
2025-03-11 01:49:37 +05:30
Sayak Paul
e7e6d85282 [Tests] improve quantization tests by additionally measuring the inference memory savings (#11021)
* memory usage tests

* fixes

* gguf
2025-03-10 21:42:24 +05:30
Aryan
8eefed65bd [LoRA] CogView4 (#10981)
* update

* make fix-copies

* update
2025-03-10 20:24:05 +05:30
Sayak Paul
26149c0ecd [LoRA] Improve warning messages when LoRA loading becomes a no-op (#10187)
* updates

* updates

* updates

* updates

* notebooks revert

* fix-copies.

* seeing

* fix

* revert

* fixes

* fixes

* fixes

* remove print

* fix

* conflicts ii.

* updates

* fixes

* better filtering of prefix.

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-03-10 09:28:32 +05:30
Ishan Modi
0703ce8800 [Single File] Add single file loading for SANA Transformer (#10947)
* added support for from_single_file

* added diffusers mapping script

* added testcase

* bug fix

* updated tests

* corrected code quality

* corrected code quality

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-10 08:38:30 +05:30
Dhruv Nair
f5edaa7894 [Quantization] Add Quanto backend (#10756)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/quantization/quanto.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update src/diffusers/quantizers/quanto/utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-10 08:33:05 +05:30
Dhruv Nair
9a1810f0de Fix for fetching variants only (#10646)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update
2025-03-10 07:45:44 +05:30
Sayak Paul
1fddee211e [LoRA] Improve copied from comments in the LoRA loader classes (#10995)
* more sanity of mind with copied from ...

* better

* better
2025-03-08 19:59:21 +05:30
Kinam Kim
b38450d5d2 Add STG to community pipelines (#10960)
* Support STG for video pipelines

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update pipeline_stg_cogvideox.py

* Update pipeline_stg_hunyuan_video.py

* Update pipeline_stg_ltx.py

* Update pipeline_stg_ltx_image2video.py

* Update pipeline_stg_mochi.py

* Update pipeline_stg_hunyuan_video.py

* Update pipeline_stg_ltx.py

* Update pipeline_stg_ltx_image2video.py

* Update pipeline_stg_mochi.py

* update

* remove rescaling

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-08 00:28:24 +05:30
Dhruv Nair
1357931d74 [Single File] Add single file support for Wan T2V/I2V (#10991)
* update

* update

* update

* update

* update

* update

* update
2025-03-07 22:13:25 +05:30
Sayak Paul
a2d3d6af44 [LoRA] remove full key prefix from peft. (#11004)
remove full key prefix from peft.
2025-03-07 21:51:59 +05:30
hlky
363d1ab7e2 Wan VAE move scaling to pipeline (#10998) 2025-03-07 10:42:17 +00:00
C
6a0137eb3b Fix Graph Breaks When Compiling CogView4 (#10959)
* Fix Graph Breaks When Compiling CogView4

Eliminate this:

```
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles] Recompiling function forward in /home/zeyi/repos/diffusers/src/diffusers/models/transformers/transformer_cogview4.py:374
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     triggered by the following guard failure(s):
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     - 0/3: ___check_obj_id(L['self'].rope.freqs_h, 139976127328032)    
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     - 0/2: ___check_obj_id(L['self'].rope.freqs_h, 139976107780960)    
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     - 0/1: ___check_obj_id(L['self'].rope.freqs_h, 140022511848960)    
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     - 0/0: ___check_obj_id(L['self'].rope.freqs_h, 140024081342416)   
```

* Update transformer_cogview4.py

* fix cogview4 rotary pos embed

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-06 22:57:17 -10:00
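The log in 6a0137eb3b shows the same guard, `___check_obj_id(L['self'].rope.freqs_h, ...)`, failing with a different object id on each recompile. As a rough pure-Python analogy (not torch code, and not the change made in the PR), the sketch below shows why an identity-based guard stops matching once an attribute is rebound to a new object, even if the values are equal. `Rope` and `IdGuard` are hypothetical names introduced only for illustration.

```python
class Rope:
    """Hypothetical holder for a cached frequency table."""

    def __init__(self):
        self.freqs_h = [0.0, 1.0]


class IdGuard:
    """Toy guard that remembers the identity of the object it first saw."""

    def __init__(self, obj):
        self.expected_id = id(obj)

    def check(self, obj):
        # Passes only if this is the very same object, not merely an equal one.
        return id(obj) == self.expected_id


rope = Rope()
guard = IdGuard(rope.freqs_h)

# Same object as when the guard was created: the check passes.
stable = guard.check(rope.freqs_h)

# Rebind the attribute to a new list with identical contents.
# The old list is kept alive in `old`, so the two ids are guaranteed to differ.
old = rope.freqs_h
rope.freqs_h = list(old)
after_rebind = guard.check(rope.freqs_h)

print(stable, after_rebind)
```

In this analogy, each failed check corresponds to one `[__recompiles]` entry in the log; keeping the guarded object stable across calls is what lets the check keep passing.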
Aryan
2e5203be04 Hunyuan I2V (#10983)
* update

* update

* update

* add tests

* update

* add model tests

* update docs

* update

* update example

* fix defaults

* update
2025-03-07 12:52:48 +05:30
yupeng1111
d55f41102a fix wan i2v pipeline bugs (#10975)
* fix wan i2v pipeline bugs

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-06 18:57:41 -10:00
LittleNyima
748cb0fab6 Add CogVideoX DDIM Inversion to Community Pipelines (#10956)
* add cogvideox ddim inversion script

* implement as a pipeline, and add documentation

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-03-06 10:46:38 -10:00
Dhruv Nair
790a909b54 [Single File] Add user agent to SF download requests. (#10979)
update
2025-03-06 10:45:20 -10:00
CyberVy
54ab475391 Fix Flux Controlnet Pipeline _callback_tensor_inputs Missing Some Elements (#10974)
* Update pipeline_flux_controlnet.py

* Update pipeline_flux_controlnet_image_to_image.py

* Update pipeline_flux_controlnet_inpainting.py

* Update pipeline_flux_controlnet_inpainting.py

* Update pipeline_flux_controlnet_inpainting.py
2025-03-06 14:26:20 -03:00
dependabot[bot]
f103993094 Bump jinja2 from 3.1.5 to 3.1.6 in /examples/research_projects/realfill (#10984)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-06 11:59:51 +00:00
Sayak Paul
1be0202502 [CI] remove synchronized. (#10980)
removed synchronized.
2025-03-06 17:03:19 +05:30
Pierre Chapuis
ea81a4228d fix default values of Flux guidance_scale in docstrings (#10982) 2025-03-06 16:37:45 +05:30
hlky
b15027636a Fix loading OneTrainer Flux LoRA (#10978)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-06 13:53:36 +05:30
Sayak Paul
6e2a93de70 [tests] fix tests for save load components (#10977)
fix tests
2025-03-06 12:30:37 +05:30
Jun Yeop Na
37b8edfb86 [train_dreambooth_lora.py] Fix the LR Schedulers when num_train_epochs is passed in a distributed training env (#10973)
* updated train_dreambooth_lora to fix the LR schedulers for `num_train_epochs` in distributed training env

* fixed formatting

* remove trailing newlines

* fixed style error
2025-03-06 10:06:24 +05:30
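The scheduler fix in 37b8edfb86 concerns how many steps the LR scheduler is told to expect when only `num_train_epochs` is given and the dataloader is sharded across processes. The sketch below is one plausible way to do that arithmetic, not the code from the PR. It assumes the scheduler is advanced once per process for every optimizer update, so both step counts are scaled by the process count. `scheduler_steps` and all of its parameter names are hypothetical.

```python
import math


def scheduler_steps(num_batches, num_processes, grad_accum_steps,
                    num_train_epochs, lr_warmup_steps):
    """Return (warmup_steps, training_steps) to hand to an LR scheduler.

    Assumption: the scheduler is stepped num_processes times per optimizer
    update, so both counts are multiplied by num_processes.
    """
    # Each process only sees its shard of the dataloader.
    batches_per_process = math.ceil(num_batches / num_processes)
    # Gradient accumulation reduces the number of optimizer updates per epoch.
    updates_per_epoch = math.ceil(batches_per_process / grad_accum_steps)
    training_steps = num_train_epochs * updates_per_epoch * num_processes
    warmup_steps = lr_warmup_steps * num_processes
    return warmup_steps, training_steps


# 1000 batches, 4 processes, accumulate over 2 batches, 3 epochs, 10 warmup steps.
warmup, total = scheduler_steps(1000, 4, 2, 3, 10)
print(warmup, total)
```

Computing the per-epoch update count from the unsharded dataloader length instead would overestimate the total by roughly the number of processes, which stretches the schedule beyond the actual run.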
Célina
fbf6b856cc use style bot GH Action from huggingface_hub (#10970)
use style bot GH action from hfh

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-05 23:39:50 +05:30
Linoy Tsaban
e031caf4ea [flux lora training] fix t5 training bug (#10845)
* fix t5 training bug

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-05 13:47:01 +02:00
hlky
08f74a8b92 Add VAE Decode endpoint slow test (#10946) 2025-03-05 11:28:06 +00:00
YiYi Xu
24c062aaa1 update check_input for cogview4 (#10966)
fix
2025-03-04 12:12:54 -10:00
Yuxuan Zhang
a74f02fb40 [Docs] CogView4 comment fix (#10957)
* Update pipeline_cogview4.py

* Use GLM instead of T5 in doc
2025-03-04 11:25:43 -10:00
Eliseu Silva
66bf7ea5be feat: add Mixture-of-Diffusers ControlNet Tile upscaler Pipeline for SDXL (#10951)
* feat: add Mixture-of-Diffusers ControlNet Tile upscaler Pipeline for SDXL

* make style make quality
2025-03-04 17:17:36 -03:00
Alexey Zolotenkov
b8215b1c06 Fix incorrect seed initialization when args.seed is 0 (#10964)
* Fix seed initialization to handle args.seed = 0 correctly

* Apply style fixes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-04 10:09:52 -10:00
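The seed fix in b8215b1c06 deals with a common Python pitfall: a truthiness check treats `0` the same as a missing value. The sketch below illustrates the pattern, assuming `--seed` is parsed with `argparse` and defaults to `None`. `should_seed_buggy` and `should_seed_fixed` are hypothetical helpers, not code from the PR.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # No seed given means "do not seed"; any integer, including 0, is a valid seed.
    parser.add_argument("--seed", type=int, default=None)
    return parser.parse_args(argv)


def should_seed_buggy(args):
    # Truthiness check: 0 is falsy, so --seed 0 is silently ignored.
    return bool(args.seed)


def should_seed_fixed(args):
    # Explicit None check: only a missing seed skips seeding.
    return args.seed is not None


zero = parse_args(["--seed", "0"])
missing = parse_args([])

print(should_seed_buggy(zero), should_seed_fixed(zero))
print(should_seed_buggy(missing), should_seed_fixed(missing))
```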
Aryan
3ee899fa0c [LoRA] Support Wan (#10943)
* update

* refactor image-to-video pipeline

* update

* fix copied from

* use FP32LayerNorm
2025-03-05 01:27:34 +05:30
CyberVy
dcd77ce222 Fix the missing parentheses when calling is_torchao_available in quantization_config.py. (#10961)
Update quantization_config.py
2025-03-04 09:52:41 -03:00
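The bug class fixed in dcd77ce222 is easy to reproduce: referring to a function without calling it yields the function object, which is always truthy. The sketch below uses a hypothetical stand-in, `is_backend_available`, rather than the real `is_torchao_available` helper.

```python
def is_backend_available():
    """Hypothetical availability check; here the backend is reported as missing."""
    return False


# Missing parentheses: this tests the function object itself, which is always truthy.
buggy_result = bool(is_backend_available)

# With parentheses: this tests the value the function returns.
fixed_result = bool(is_backend_available())

print(buggy_result, fixed_result)
```

With the parentheses missing, a guard like `if is_backend_available:` always takes the "available" branch, even when the dependency is not installed.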
a120092009
11d8e3ce2c [Quantization] support pass MappingType for TorchAoConfig (#10927)
* [Quantization] support pass MappingType for TorchAoConfig

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-04 16:40:50 +05:30
Sayak Paul
97fda1b75c [LoRA] feat: support non-diffusers lumina2 LoRAs. (#10909)
* feat: support non-diffusers lumina2 LoRAs.

* revert ipynb changes (but I don't know why this is required ☹️)

* empty

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-04 14:40:55 +05:30
Sayak Paul
cc22058324 Update evaluation.md (#10938)
* Update evaluation.md

* Update docs/source/en/conceptual/evaluation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-03-04 13:58:16 +05:30
Fanli Lin
7855ac597e [tests] make tests device-agnostic (part 4) (#10508)
* initial commit

* fix empty cache

* fix one more

* fix style

* update device functions

* update

* update

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/controlnet/test_controlnet.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/controlnet/test_controlnet.py

Co-authored-by: hlky <hlky@hlky.ac>

* with gc.collect

* update

* make style

* check_torch_dependencies

* add mps empty cache

* add changes

* bug fix

* enable on xpu

* update more cases

* revert

* revert back

* Update test_stable_diffusion_xl.py

* Update tests/pipelines/stable_diffusion/test_stable_diffusion.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py

Co-authored-by: hlky <hlky@hlky.ac>

* Apply suggestions from code review

Co-authored-by: hlky <hlky@hlky.ac>

* add test marker

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-03-04 08:26:06 +00:00
CyberVy
30cef6bff3 Improve load_ip_adapter RAM Usage (#10948)
* Update ip_adapter.py

* Update ip_adapter.py

* Update ip_adapter.py

* Update ip_adapter.py

* Update ip_adapter.py

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-03-04 07:21:23 +00:00
Ahmed Belgacem
8f15be169f Fix redundant prev_output_channel assignment in UNet2DModel (#10945) 2025-03-03 11:43:15 -10:00
Yuxuan Zhang
f92e599c70 Update pipeline_cogview4.py (#10944) 2025-03-03 09:42:01 -10:00
Parag Ekbote
982f9b38d6 Add Example of IPAdapterScaleCutoffCallback to Docs (#10934)
* Add example of Ip-Adapter-Callback.

* Add image links from HF Hub.
2025-03-03 08:32:45 -08:00
fancydaddy
c9a219b323 add from_single_file to animatediff (#10924)
* Update pipeline_animatediff.py

* Update pipeline_animatediff_controlnet.py

* Update pipeline_animatediff_sparsectrl.py

* Update pipeline_animatediff_video2video.py

* Update pipeline_animatediff_video2video_controlnet.py

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-03 19:11:54 +05:30
Teriks
9e910c4633 Fix SD2.X clip single file load projection_dim (#10770)
* Fix SD2.X clip single file load projection_dim

Infer projection_dim from the checkpoint before loading
from pretrained, override any incorrect hub config.

Hub configuration for SD2.X specifies projection_dim=512
which is incorrect for SD2.X checkpoints loaded from civitai
and similar.

Exception was previously thrown upon attempting to
load_model_dict_into_meta for SD2.X single file checkpoints.

Such LDM models usually require projection_dim=1024

* convert_open_clip_checkpoint use hidden_size for text_proj_dim

* convert_open_clip_checkpoint, revert checkpoint[text_proj_key].shape[1] -> [0]

values are identical

---------

Co-authored-by: Teriks <Teriks@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-03 19:00:39 +05:30
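The message of 9e910c4633 describes inferring `projection_dim` from the checkpoint and overriding an incorrect hub config value. A minimal sketch of that idea follows, using a stand-in tensor type so it runs without torch. `TEXT_PROJ_KEY`, `FakeTensor`, and `infer_projection_dim` are hypothetical names; the real key and loading code are not shown in the commit message.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Stand-in for a tensor; only the shape matters for this sketch."""
    shape: tuple


# Hypothetical key for the text projection weight in the checkpoint.
TEXT_PROJ_KEY = "text_projection"


def infer_projection_dim(checkpoint, default):
    """Read the projection size from the checkpoint, else fall back to the config value."""
    tensor = checkpoint.get(TEXT_PROJ_KEY)
    if tensor is None:
        return default
    # The commit notes that shape[0] and shape[1] are identical for this weight.
    return int(tensor.shape[0])


hub_config = {"projection_dim": 512}  # value the commit message calls incorrect
checkpoint = {TEXT_PROJ_KEY: FakeTensor(shape=(1024, 1024))}

config = dict(hub_config)
config["projection_dim"] = infer_projection_dim(checkpoint, hub_config["projection_dim"])
print(config["projection_dim"])
```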
Bubbliiiing
5e3b7d2d8a Add EasyAnimateV5.1 text-to-video, image-to-video, control-to-video generation model (#10626)
* Update EasyAnimate V5.1

* Add docs && add tests && Fix comments problems in transformer3d and vae

* delete comments and remove useless import

* delete process

* Update EXAMPLE_DOC_STRING

* rename transformer file

* make fix-copies

* make style

* refactor pt. 1

* update toctree.yml

* add model tests

* Update layer_norm for norm_added_q and norm_added_k in Attention

* Fix processor problem

* refactor vae

* Fix problem in comments

* refactor tiling; remove einops dependency

* fix docs path

* make fix-copies

* Update src/diffusers/pipelines/easyanimate/pipeline_easyanimate_control.py

* update _toctree.yml

* fix test

* update

* update

* update

* make fix-copies

* fix tests

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-03 18:37:19 +05:30
Sayak Paul
7513162b8b [Tests] Remove more encode prompts tests (#10942)
* fix-copies went uncaught it seems.

* remove more unneeded encode_prompt() tests

* Revert "fix-copies went uncaught it seems."

This reverts commit eefb302791.

* empty
2025-03-03 16:55:01 +05:30
Sayak Paul
4aaa0d21ba [chore] fix-copies to flux pipelines (#10941)
fix-copies went uncaught it seems.
2025-03-03 11:21:57 +05:30
hlky
54043c3e2e Update VAE Decode endpoints (#10939) 2025-03-02 18:29:53 +00:00