mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

182 Commits

Author SHA1 Message Date
omahs
5b202111bf Fix typos (#12705) 2026-01-10 11:11:15 -08:00
Kashif Rasul
2bb640f8ea [Research] Latent Perceptual Loss (LPL) for Stable Diffusion XL (#11573)
* initial

* added readme

* fix formatting

* added logging

* formatting

* use config

* debug

* better

* handle SNR

* floats have no item()

* remove debug

* formatting

* add paper link

* acknowledge reference source

* rename script

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-01-09 10:24:21 -10:00
Yuqian Hong
58519283e7 Support for control-lora (#10686)
* run control-lora on diffusers

* cannot load lora adapter

* test

* 1

* add control-lora

* 1

* 1

* 1

* fix PeftAdapterMixin

* fix module_to_save bug

* delete json print

* resolve conflicts

* merged but bug

* change peft.py

* 1

* delete state_dict print

* fix alpha

* Create control_lora.py

* Add files via upload

* rename

* no need to modify as peft updated

* add doc

* fix code style

* styling isn't that hard 😉

* empty

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-15 15:52:42 +05:30
Ali Imran
1b456bd5d5 docs: cleanup of runway model (#12503)
* cleanup of runway model

* quality fixes
2025-10-17 14:10:50 -07:00
Sayak Paul
5e181eddfe Deprecate slicing and tiling methods from DiffusionPipeline (#12271)
* deprecate slicing from flux pipeline.

* propagate.

* tiling

* up

* up
2025-09-11 10:04:35 +05:30
co63oc
764b62473a fix some typos (#12265)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-09-03 21:28:24 +05:30
sqt
0d1c5b0c3e Fix typo: 'will ge generated' -> 'will be generated' (#12231) 2025-08-25 12:47:52 -07:00
Sayak Paul
c9c8217306 [chore] complete the licensing statement. (#12001)
complete the licensing statement.
2025-08-11 22:15:15 +05:30
Álvaro Somoza
edcbe8038b Fix huggingface-hub failing tests (#11994)
* login

* more logins

* uploads

* missed login

* another missed login

* downloads

* examples and more logins

* fix

* setup

* Apply style fixes

* fix

* Apply style fixes
2025-07-29 02:34:58 -04:00
Aryan
a4df8dbc40 Update more licenses to 2025 (#11746)
update
2025-06-19 07:46:01 +05:30
co63oc
8183d0f16e Fix typos in strings and comments (#11476)
* Fix typos in strings and comments

Signed-off-by: co63oc <co63oc@users.noreply.github.com>

* Update src/diffusers/hooks/hooks.py

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update src/diffusers/hooks/hooks.py

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update layerwise_casting.py

* Apply style fixes

* update

---------

Signed-off-by: co63oc <co63oc@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-30 18:49:00 +05:30
Quentin Gallouédec
c8bb1ff53e Use HF Papers (#11567)
* Use HF Papers

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-19 06:22:33 -10:00
Abdellah Oumida
ddd0cfb497 Fix typo in train_diffusion_orpo_sdxl_lora_wds.py (#11541) 2025-05-12 15:28:29 -10:00
scxue
784db0eaab Add cross attention type for Sana-Sprint training in diffusers. (#11514)
* test permission

* Add cross attention type for Sana-Sprint.

* Add Sana-Sprint training script in diffusers.

* make style && make quality;

* modify the attention processor with `set_attn_processor` and change `SanaAttnProcessor3_0` to `SanaVanillaAttnProcessor`

* Add import for SanaVanillaAttnProcessor

* Add README file.

* Apply suggestions from code review

* style

* Update examples/research_projects/sana/README.md

---------

Co-authored-by: lawrence-cj <cjs1020440147@icloud.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-08 18:55:29 +05:30
co63oc
86294d3c7f Fix typos in docs and comments (#11416)
* Fix typos in docs and comments

* Apply style fixes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-30 20:30:53 -10:00
co63oc
f00a995753 Fix typos in strings and comments (#11407) 2025-04-24 08:53:47 -10:00
Dhruv Nair
edc154da09 Update Ruff to latest Version (#10919)
* update

* update

* update

* update
2025-04-09 16:51:34 +05:30
Kenneth Gerald Hamilton
f10775b1b5 Fixed requests.get function call by adding timeout parameter. (#11156)
* Fixed requests.get function call by adding timeout parameter.

* declare DIFFUSERS_REQUEST_TIMEOUT in constants and import when needed

* remove unneeded os import

* Apply style fixes

---------

Co-authored-by: Sai-Suraj-27 <sai.suraj.27.729@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-04 07:23:14 +01:00
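
The change above boils down to passing an explicit timeout to every `requests.get` call, with the value read from a shared constant. A minimal Python sketch of that pattern, assuming an illustrative default value and module layout (only the constant name `DIFFUSERS_REQUEST_TIMEOUT` comes from the commit itself):

```python
import requests

# Assumed default for illustration; the real value lives in diffusers' constants module.
DIFFUSERS_REQUEST_TIMEOUT = 60  # seconds


def fetch(url: str) -> bytes:
    # An explicit timeout keeps requests.get from hanging indefinitely
    # when the remote host never responds.
    response = requests.get(url, timeout=DIFFUSERS_REQUEST_TIMEOUT)
    response.raise_for_status()
    return response.content
```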
Tolga Cangöz
0213179ba8 Update README and example code for AnyText usage (#11028)
* [Documentation] Update README and example code with additional usage instructions for AnyText

* [Documentation] Update README for AnyTextPipeline and improve logging in code

* Remove wget command for font file from example docstring in anytext.py
2025-03-23 21:15:57 +05:30
Yuqian Hong
fc28791fc8 [BUG] Fix Autoencoderkl train script (#11113)
* add disc_optimizer step (not fix)

* support syncbatchnorm in discriminator
2025-03-19 16:49:02 +05:30
Juan Acevedo
27916822b2 update readme instructions. (#11096)
Co-authored-by: Juan Acevedo <jfacevedo@google.com>
2025-03-17 20:07:48 -10:00
Juan Acevedo
6b9a3334db reverts accidental change that removes attn_mask in attn. Improves fl… (#11065)
reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop.

Co-authored-by: Juan Acevedo <jfacevedo@google.com>
2025-03-14 12:47:01 -10:00
Yaniv Galron
5e48cd27d4 making ``formatted_images`` initialization compact (#10801)
compact writing

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-13 09:27:14 -10:00
Tolga Cangöz
b88fef4785 [Research Project] Add AnyText: Multilingual Visual Text Generation And Editing (#8998)
* Add initial template

* Second template

* feat: Add TextEmbeddingModule to AnyTextPipeline

* feat: Add AuxiliaryLatentModule template to AnyTextPipeline

* Add bert tokenizer from the anytext repo for now

* feat: Update AnyTextPipeline's modify_prompt method

This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe.

* Fill in the `forward` pass of `AuxiliaryLatentModule`

* `make style && make quality`

* `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library`

* Update error handling to raise and logging

* Add `create_glyph_lines` function into `TextEmbeddingModule`

* make style

* Up

* Up

* Up

* Up

* Remove several comments

* refactor: Remove ControlNetConditioningEmbedding and update code accordingly

* Up

* Up

* up

* refactor: Update AnyTextPipeline to include new optional parameters

* up

* feat: Add OCR model and its components

* chore: Update `TextEmbeddingModule` to include OCR model components and dependencies

* chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task

* `make style`

* refactor: Update `AnyTextPipeline`'s docstring

* Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once

* simplify

* `make style`

* Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function

* Simplify for now

* `make style`

* Up

* feat: Add scripts to convert AnyText controlnet to diffusers

* `make style`

* Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule`

* make style

* Up

* Simplify

* Up

* feat: Add safetensors module for loading model file

* Fix device issues

* Up

* Up

* refactor: Simplify

* refactor: Simplify code for loading models and handling data types

* `make style`

* refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule

* refactor: Update dtype in embedding_manager.py to match proj.weight

* Up

* Add attribution and adaptation information to pipeline_anytext.py

* Update usage example

* Will refactor `controlnet_cond_embedding` initialization

* Add `AnyTextControlNetConditioningEmbedding` template

* Refactor organization

* style

* style

* Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding`

* Follow one-file policy

* style

* [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel

* [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py

* [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py

* Refactor AnyTextControlNet to use configurable conditioning embedding channels

* Complete control net conditioning embedding in AnyTextControlNetModel

* up

* [FIX] Ensure embeddings use correct device in AnyTextControlNetModel

* up

* up

* style

* [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline

* [UPDATE] Update example code in anytext.py to use correct font file and improve clarity

* down

* [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing

* update pillow

* [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity

* [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file

* [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency

* 🆙

* style

* [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py

* style

* Update examples/research_projects/anytext/README.md

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Remove commented-out image preparation code in AnyTextPipeline

* Remove unnecessary blank line in README.md
2025-03-11 01:49:37 +05:30
dependabot[bot]
f103993094 Bump jinja2 from 3.1.5 to 3.1.6 in /examples/research_projects/realfill (#10984)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-06 11:59:51 +00:00
Aryan
c4d4ac21e7 Refactor gradient checkpointing (#10611)
* update

* remove unused fn

* apply suggestions based on review

* update + cleanup 🧹

* more cleanup 🧹

* make fix-copies

* update test
2025-01-28 06:51:46 +05:30
Yuqian Hong
4fa24591a3 create a script to train autoencoderkl (#10605)
* create a script to train vae

* update main.py

* update train_autoencoderkl.py

* update train_autoencoderkl.py

* add a check of --pretrained_model_name_or_path and --model_config_name_or_path

* remove the comment, remove diffusers from requirements.txt, add validation_image note

* update autoencoderkl.py

* quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-01-27 16:41:34 +05:30
Sayak Paul
4ace7d0483 [chore] change licensing to 2025 from 2024. (#10615)
change licensing to 2025 from 2024.
2025-01-20 16:57:27 -10:00
baymax591
75a636da48 bugfix for npu not supporting float64 (#10123)
* bugfix for npu not supporting float64

* is_mps is_npu

---------

Co-authored-by: 白超 <baichao19@huawei.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-01-20 09:35:24 -10:00
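
The fix above follows a pattern seen elsewhere in diffusers: fall back to float32 on backends such as MPS and NPU that do not implement float64. A hedged sketch of that dtype selection (the device checks approximate the `is_mps` / `is_npu` flags mentioned in the bullets; this is not the exact PR code):

```python
import torch


def make_timesteps(num_steps: int, device: torch.device) -> torch.Tensor:
    # MPS and NPU backends lack float64 kernels, so compute in float32 there
    # and keep float64 precision everywhere else.
    is_mps = device.type == "mps"
    is_npu = device.type == "npu"
    dtype = torch.float32 if (is_mps or is_npu) else torch.float64
    return torch.linspace(0, 1, num_steps, dtype=dtype, device=device)
```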
Juan Acevedo
aeac0a00f8 implementing flux on TPUs with ptxla (#10515)
* implementing flux on TPUs with ptxla

* add xla flux attention class

* run make style/quality

* Update src/diffusers/models/attention_processor.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/attention_processor.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* run style and quality

---------

Co-authored-by: Juan Acevedo <jfacevedo@google.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-01-16 08:46:02 -10:00
hlky
980736b792 Fix train_dreambooth_lora_sd3_miniature (#10554) 2025-01-13 13:47:27 +00:00
hlky
ee7e141d80 Use pipelines without vae (#10441)
* Use pipelines without vae

* getattr

* vqvae

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-01-07 13:26:51 -10:00
dependabot[bot]
e0b96ba7b0 Bump jinja2 from 3.1.4 to 3.1.5 in /examples/research_projects/realfill (#10377)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.4...3.1.5)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-07 19:59:41 +05:30
Rahul Raman
f1e0c7ce4a Refactor instructpix2pix lora to support peft (#10205)
* make base code changes referred from train_instructpix2pix script in examples

* change code to use PEFT as discussed in issue 10062

* update README training command

* update README training command

* refactor variable name and freezing unet

* Update examples/research_projects/instructpix2pix_lora/train_instruct_pix2pix_lora.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update README installation instructions.

* cleanup code using make style and quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-01-07 12:00:45 +05:30
Juan Acevedo
3cb7b8628c Update ptxla training (#9864)
* update ptxla example

---------

Co-authored-by: Juan Acevedo <jfacevedo@google.com>
Co-authored-by: Pei Zhang <zpcore@gmail.com>
Co-authored-by: Pei Zhang <piz@google.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pei Zhang <pei@Peis-MacBook-Pro.local>
Co-authored-by: hlky <hlky@hlky.ac>
2024-12-06 10:50:13 -10:00
Parag Ekbote
cc7d88f247 Move IP Adapter Scripts to research project (#9960)
* Move files to research-projects.

* docs: add IP Adapter training instructions

* Delete venv

* Update examples/ip_adapter/tutorial_train_sdxl.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Cherry-picked commits and re-moved files
to research_projects.

* make style.

* Update toctree and delete ip_adapter.

* Nit Fix

* Fix nit.

* Fix nit.

* Create training script for single GPU and set
model format to .safetensors

* Add sample inference script and restore _toctree

* Restore toctree.yaml

* fix spacing.

* Update toctree.yaml

---------

Co-authored-by: AMohamedAakhil <a.aakhilmohamed@gmail.com>
Co-authored-by: BootesVoid <78485654+AMohamedAakhil@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-11-19 10:37:22 -08:00
Parag Ekbote
e255920719 Move Wuerstchen Dreambooth to research_projects (#9935)
update file paths to research_projects folder.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-11-16 18:56:16 +05:30
Michael Tkachuk
5b972fbd6a Enabling gradient checkpointing in eval() mode (#9878)
* refactored
2024-11-08 09:03:26 -10:00
Sayak Paul
ded3db164b [Core] introduce controlnet module (#8768)
* move vae flax module.

* controlnet module.

* prepare for PR.

* revert a commit

* gracefully deprecate controlnet deps.

* fix

* fix doc path

* fix-copies

* fix path

* style

* style

* conflicts

* fix

* fix-copies

* sparsectrl.

* updates

* fix

* updates

* updates

* updates

* fix

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-11-06 22:08:55 -04:00
Sayak Paul
8ce37ab055 [training] use the lr when using 8bit adam. (#9796)
* use the lr when using 8bit adam.

* remove lr as we pack it in params_to_optimize.

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-10-31 15:51:42 +05:30
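
The two bullets above amount to packing the learning rate into the parameter group passed to the optimizer, so the 8-bit Adam from bitsandbytes actually uses it. A small sketch under that reading (the model and values are placeholders, not the script's real arguments):

```python
import bitsandbytes as bnb  # requires a CUDA-enabled bitsandbytes install
import torch

model = torch.nn.Linear(8, 8)

# The lr travels inside the parameter group, so no separate lr= argument is
# passed to the optimizer constructor.
params_to_optimize = [{"params": model.parameters(), "lr": 1e-4}]
optimizer = bnb.optim.AdamW8bit(params_to_optimize, weight_decay=1e-2)
```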
Sayak Paul
09b8aebd67 [training] fixes to the quantization training script and add AdEMAMix optimizer as an option (#9806)
* fixes

* more fixes.
2024-10-31 15:46:00 +05:30
Raul Ciotescu
c5376c5695 adds the pipeline for pixart alpha controlnet (#8857)
* add the controlnet pipeline for pixart alpha

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: junsongc <cjs1020440147@icloud.com>
2024-10-28 08:48:04 -10:00
Sayak Paul
fddbab7993 [research_projects] Update README.md to include a note about NF4 T5-xxl (#9775)
Update README.md
2024-10-26 22:13:03 +09:00
Sayak Paul
df073ba137 [research_projects] add flux training script with quantization (#9754)
* add flux training script with quantization

* remove exclamation
2024-10-26 00:07:57 +09:00
GSSun
164ec9f423 fix IsADirectoryError when running the training code for sd3_dreambooth_lora_16gb.ipynb (#9634)
Add files via upload

fix IsADirectoryError when running the training code
2024-10-11 13:33:39 +05:30
suzukimain
b52119ae92 [docs] Replace runwayml/stable-diffusion-v1-5 with Lykon/dreamshaper-8 (#9428)
* [docs] Replace runwayml/stable-diffusion-v1-5 with Lykon/dreamshaper-8

Updated documentation as runwayml/stable-diffusion-v1-5 has been removed from Huggingface.

* Update docs/source/en/using-diffusers/inpaint.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Replace with stable-diffusion-v1-5/stable-diffusion-v1-5

* Update inpaint.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-09-16 10:18:45 -07:00
Juan Acevedo
45aa8bb187 Ptxla sd training (#9381)
* enable ptxla training of stable diffusion 2.x models.

* run linter/style and run pipeline test for stable diffusion and fix issues.

* update xla libraries

* fix read me newline.

* move files to research folder.

* update per comments.

* rename readme.

---------

Co-authored-by: Juan Acevedo <jfacevedo@google.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-09-12 08:35:06 +05:30
dependabot[bot]
0c1e63bd11 Bump jinja2 from 3.1.3 to 3.1.4 in /examples/research_projects/realfill (#7873)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.3...3.1.4)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-14 16:36:59 +05:30
dependabot[bot]
e7e45bd127 Bump torch from 2.0.1 to 2.2.0 in /examples/research_projects/realfill (#8971)
Bumps [torch](https://github.com/pytorch/pytorch) from 2.0.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v2.0.1...v2.2.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-14 16:36:46 +05:30
Tolga Cangöz
3dc97bd148 Update CLIPFeatureExtractor to CLIPImageProcessor and DPTFeatureExtractor to DPTImageProcessor (#9002)
* fix: update `CLIPFeatureExtractor` to `CLIPImageProcessor` in codebase

* `make style && make quality`

* Update `DPTFeatureExtractor` to `DPTImageProcessor` in codebase

* `make style`

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2024-08-05 09:20:29 -10:00