1
0
mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
Commit Graph

5947 Commits

Author SHA1 Message Date
DN6
eb6d907b14 update 2025-10-24 18:11:07 +05:30
DN6
e86248029d update 2025-10-24 17:32:16 +05:30
DN6
0cee3621b6 update 2025-10-24 11:18:08 +05:30
DN6
1673ece773 update 2025-10-24 10:50:13 +05:30
DN6
208b955b9c update 2025-10-24 10:27:53 +05:30
DN6
1d9dfd2c19 update 2025-10-24 10:07:23 +05:30
DN6
1f217a5440 update 2025-10-24 10:04:34 +05:30
DN6
7f0942abb4 Merge branch 'main' into pr-test-speed-up 2025-10-24 10:03:19 +05:30
Dhruv Nair
9c3b58dcf1 Handle deprecated transformer classes (#12517)
* update

* update

* update
2025-10-23 16:22:07 +05:30
Aishwarya Badlani
74b5fed434 Fix MPS compatibility in get_1d_sincos_pos_embed_from_grid #12432 (#12449)
* Fix MPS compatibility in get_1d_sincos_pos_embed_from_grid #12432

* Fix trailing whitespace in docstring

* Apply style fixes

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-10-23 16:18:07 +05:30
kaixuanliu
85eb505672 fix CI bug for kandinsky3_img2img case (#12474)
* fix CI bug for kandinsky3_img2img case

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update code

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

---------

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
2025-10-23 16:17:22 +05:30
Sayak Paul
ccdd96ca52 [tests] Test attention backends (#12388)
* add a lightweight test suite for attention backends.

* up

* up

* Apply suggestions from code review

* formatting
2025-10-23 15:09:41 +05:30
Sayak Paul
4c723d8ec3 [CI] xfail the test_wuerstchen_prior test (#12530)
xfail the test_wuerstchen_prior test
2025-10-22 08:45:47 -10:00
YiYi Xu
bec2d8eaea Fix: Add _skip_keys for AutoencoderKLWan (#12523)
add
2025-10-22 07:53:13 -10:00
Álvaro Somoza
a0a51eb098 Kandinsky5 No cfg fix (#12527)
fix
2025-10-22 22:02:47 +05:30
Sayak Paul
a5a0ccf86a [core] AutoencoderMixin to abstract common methods (#12473)
* up

* correct wording.

* up

* up

* up
2025-10-22 08:52:06 +05:30
David Bertoin
dd07b19e27 Prx (#12525)
* rename photon to prx

* rename photon into prx

* Revert .gitignore to state before commit b7fb0fe9d6

* rename photon to prx

* rename photon into prx

* Revert .gitignore to state before commit b7fb0fe9d6

* make fix-copies
2025-10-21 17:09:22 -07:00
vb
57636ad4f4 purge HF_HUB_ENABLE_HF_TRANSFER; promote Xet (#12497)
* purge HF_HUB_ENABLE_HF_TRANSFER; promote Xet

* purge HF_HUB_ENABLE_HF_TRANSFER; promote Xet x2

* restrict docker build test to the ones we actually use in CI.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-10-22 00:59:20 +05:30
David Bertoin
cefc2cf82d Add Photon model and pipeline support (#12456)
* Add Photon model and pipeline support

This commit adds support for the Photon image generation model:
- PhotonTransformer2DModel: Core transformer architecture
- PhotonPipeline: Text-to-image generation pipeline
- Attention processor updates for Photon-specific attention mechanism
- Conversion script for loading Photon checkpoints
- Documentation and tests

* just store the T5Gemma encoder

* enhance_vae_properties if vae is provided only

* remove autocast for text encoder forwad

* BF16 example

* conditioned CFG

* remove enhance vae and use vae.config directly when possible

* move PhotonAttnProcessor2_0 in transformer_photon

* remove einops dependency and now inherits from AttentionMixin

* unify the structure of the forward block

* update doc

* update doc

* fix T5Gemma loading from hub

* fix timestep shift

* remove lora support from doc

* Rename EmbedND for PhotoEmbedND

* remove modulation dataclass

* put _attn_forward and _ffn_forward logic in PhotonBlock's forward

* renam LastLayer for FinalLayer

* remove lora related code

* rename vae_spatial_compression_ratio for vae_scale_factor

* support prompt_embeds in call

* move xattention conditionning out computation out of the denoising loop

* add negative prompts

* Use _import_structure for lazy loading

* make quality + style

* add pipeline test + corresponding fixes

* utility function that determines the default resolution given the VAE

* Refactor PhotonAttention to match Flux pattern

* built-in RMSNorm

* Revert accidental .gitignore change

* parameter names match the standard diffusers conventions

* renaming and remove unecessary attributes setting

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* quantization example

* added doc to toctree

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* use dispatch_attention_fn for multiple attention backend support

* naming changes

* make fix copy

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* Add PhotonTransformer2DModel to TYPE_CHECKING imports

* make fix-copies

* Use Tuple instead of tuple

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* restrict the version of transformers

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* Update tests/pipelines/photon/test_pipeline_photon.py

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* Update tests/pipelines/photon/test_pipeline_photon.py

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* change | for Optional

* fix nits.

* use typing Dict

---------

Co-authored-by: davidb <davidb@worker-10.soperator-worker-svc.soperator.svc.cluster.local>
Co-authored-by: David Briand <david@photoroom.com>
Co-authored-by: davidb <davidb@worker-8.soperator-worker-svc.soperator.svc.cluster.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2025-10-21 20:55:55 +05:30
Sayak Paul
b3e56e71fb styling issues. (#12522) 2025-10-21 20:04:54 +05:30
Steven Liu
5b5fa49a89 [docs] Organize toctree by modality (#12514)
* reorganize

* fix

---------

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-10-21 10:18:54 +05:30
Fei Xie
decfa3c9e1 Fix: Use incorrect temporary variable key when replacing adapter name… (#12502)
Fix: Use incorrect temporary variable key when replacing adapter name in state dict within load_lora_adapter function

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-10-20 15:45:37 -10:00
Dhruv Nair
48305755bf Raise warning instead of error when imports are missing for custom code (#12513)
update
2025-10-20 07:02:23 -10:00
dg845
7853bfbed7 Remove Qwen Image Redundant RoPE Cache (#12452)
Refactor QwenEmbedRope to only use the LRU cache for RoPE caching
2025-10-19 18:41:58 -07:00
Lev Novitskiy
23ebbb4bc8 Kandinsky 5 is finally in Diffusers! (#12478)
* add kandinsky5 transformer pipeline first version

---------

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Charles <charles@huggingface.co>
2025-10-17 18:34:30 -10:00
Ali Imran
1b456bd5d5 docs: cleanup of runway model (#12503)
* cleanup of runway model

* quality fixes
2025-10-17 14:10:50 -07:00
Sayak Paul
af769881d3 [tests] introduce VAETesterMixin to consolidate tests for slicing and tiling (#12374)
* up

* up

* up

* up

* up

* u[

* up

* up

* up
2025-10-17 12:02:29 +05:30
Sayak Paul
4715c5c769 [ci] xfail more incorrect transformer imports. (#12455)
* xfail more incorrect transformer imports.

* xfail more.

* up

* up

* up
2025-10-17 10:35:19 +05:30
Steven Liu
dbe413668d [CI] Check links (#12491)
* check links

* update

* feedback

* remove
2025-10-16 10:38:16 -07:00
Steven Liu
26475082cb [docs] Attention checks (#12486)
* checks

* feedback

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-10-16 09:19:30 -07:00
YiYi Xu
f072c64bf2 ltx0.9.8 (without IC lora, autoregressive sampling) (#12493)
update

Co-authored-by: Aryan <aryan@huggingface.co>
2025-10-15 07:41:17 -10:00
Sayak Paul
aed636f5f0 [tests] fix clapconfig for text backbone in audioldm2 (#12490)
fix clapconfig for text backbone in audioldm2
2025-10-15 10:57:09 +05:30
Sayak Paul
53a10518b9 remove unneeded checkpoint imports. (#12488) 2025-10-15 09:51:18 +05:30
Steven Liu
b4e6dc3037 [docs] Fix broken links (#12487)
fix broken links
2025-10-15 06:42:10 +05:30
Steven Liu
3eb40786ca [docs] Prompting (#12312)
* init

* fix

* batch inf

* feedback

* update
2025-10-14 13:53:56 -07:00
Meatfucker
a4bc845478 Fix missing load_video documentation and load_video import in WanVideoToVideoPipeline example code (#12472)
* Update utilities.md

Update missing load_video documentation

* Update pipeline_wan_video2video.py

Fix missing load_video import in example code
2025-10-14 10:43:21 -07:00
Manith Ratnayake
fa468c5d57 docs: api-pipelines-qwenimage typo fix (#12461) 2025-10-13 08:57:46 -07:00
Steven Liu
8abc7aeb71 [docs] Fix syntax (#12464)
* fix syntax

* fix

* style

* fix
2025-10-11 08:13:30 +05:30
Sayak Paul
693d8a3a52 [modular] i2i and t2i support for kontext modular (#12454)
* up

* get ready

* fix import

* up

* up
2025-10-10 18:10:17 +05:30
Sayak Paul
a9df12ab45 Update Dockerfile to include zip wget for doc-builder (#12451) 2025-10-09 15:25:03 +05:30
Sayak Paul
a519272d97 [ci] revisit the installations in CI. (#12450)
* revisit the installations in CI.

* up

* up

* up

* empty

* up

* up

* up
2025-10-08 19:21:24 +05:30
Sayak Paul
345864eb85 fix more torch.distributed imports (#12425)
* up

* unguard.
2025-10-08 10:45:39 +05:30
Sayak Paul
35e538d46a fix dockerfile definitions. (#12424)
* fix dockerfile definitions.

* python 3.10 slim.

* up

* up

* up

* up

* up

* revert pr_tests.yml changes

* up

* up

* reduce python version for torch 2.1.0
2025-10-08 09:46:18 +05:30
Sayak Paul
2dc31677e1 Align Flux modular more and more with Qwen modular (#12445)
* start

* fix

* up
2025-10-08 09:22:34 +05:30
Linoy Tsaban
1066de8c69 [Qwen LoRA training] fix bug when offloading (#12440)
* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled
2025-10-07 18:27:15 +03:00
Sayak Paul
2d69bacb00 handle offload_state_dict when initing transformers models (#12438) 2025-10-07 13:51:20 +05:30
Changseop Yeom
0974b4c606 [i18n-KO] Fix typo and update translation in ethical_guidelines.md (#12435) 2025-10-06 14:24:05 -07:00
Charles
cf4b97b233 [perf] Cache version checks (#12399) 2025-10-06 17:45:34 +02:00
Sayak Paul
7f3e9b8695 make flux ready for mellon (#12419)
* make flux ready for mellon

* up

* Apply suggestions from code review

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>

---------

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-10-06 13:15:54 +05:30
SahilCarterr
ce90f9b2db [FIX] Text to image training peft version (#12434)
Fix peft error
2025-10-06 08:24:54 +05:30