1
0
mirror of https://github.com/huggingface/diffusers.git synced 2026-01-29 07:22:12 +03:00
Commit Graph

5939 Commits

Author SHA1 Message Date
sayakpaul
7d8c20b94e Merge branch 'main' into feat/autoencodermixin 2025-10-21 17:25:48 -10:00
Sayak Paul
a5a0ccf86a [core] AutoencoderMixin to abstract common methods (#12473)
* up

* correct wording.

* up

* up

* up
2025-10-22 08:52:06 +05:30
Sayak Paul
234ffa4cbe Merge branch 'main' into feat/autoencodermixin 2025-10-22 06:20:15 +05:30
sayakpaul
99bc6649d0 up 2025-10-21 14:49:50 -10:00
David Bertoin
dd07b19e27 Prx (#12525)
* rename photon to prx

* rename photon into prx

* Revert .gitignore to state before commit b7fb0fe9d6

* rename photon to prx

* rename photon into prx

* Revert .gitignore to state before commit b7fb0fe9d6

* make fix-copies
2025-10-21 17:09:22 -07:00
Sayak Paul
cc65d0a09c Merge branch 'main' into feat/autoencodermixin 2025-10-22 02:17:28 +05:30
vb
57636ad4f4 purge HF_HUB_ENABLE_HF_TRANSFER; promote Xet (#12497)
* purge HF_HUB_ENABLE_HF_TRANSFER; promote Xet

* purge HF_HUB_ENABLE_HF_TRANSFER; promote Xet x2

* restrict docker build test to the ones we actually use in CI.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-10-22 00:59:20 +05:30
David Bertoin
cefc2cf82d Add Photon model and pipeline support (#12456)
* Add Photon model and pipeline support

This commit adds support for the Photon image generation model:
- PhotonTransformer2DModel: Core transformer architecture
- PhotonPipeline: Text-to-image generation pipeline
- Attention processor updates for Photon-specific attention mechanism
- Conversion script for loading Photon checkpoints
- Documentation and tests

* just store the T5Gemma encoder

* enhance_vae_properties if vae is provided only

* remove autocast for text encoder forwad

* BF16 example

* conditioned CFG

* remove enhance vae and use vae.config directly when possible

* move PhotonAttnProcessor2_0 in transformer_photon

* remove einops dependency and now inherits from AttentionMixin

* unify the structure of the forward block

* update doc

* update doc

* fix T5Gemma loading from hub

* fix timestep shift

* remove lora support from doc

* Rename EmbedND for PhotoEmbedND

* remove modulation dataclass

* put _attn_forward and _ffn_forward logic in PhotonBlock's forward

* renam LastLayer for FinalLayer

* remove lora related code

* rename vae_spatial_compression_ratio for vae_scale_factor

* support prompt_embeds in call

* move xattention conditionning out computation out of the denoising loop

* add negative prompts

* Use _import_structure for lazy loading

* make quality + style

* add pipeline test + corresponding fixes

* utility function that determines the default resolution given the VAE

* Refactor PhotonAttention to match Flux pattern

* built-in RMSNorm

* Revert accidental .gitignore change

* parameter names match the standard diffusers conventions

* renaming and remove unecessary attributes setting

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* quantization example

* added doc to toctree

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* use dispatch_attention_fn for multiple attention backend support

* naming changes

* make fix copy

* Update docs/source/en/api/pipelines/photon.md

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* Add PhotonTransformer2DModel to TYPE_CHECKING imports

* make fix-copies

* Use Tuple instead of tuple

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* restrict the version of transformers

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* Update tests/pipelines/photon/test_pipeline_photon.py

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* Update tests/pipelines/photon/test_pipeline_photon.py

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* change | for Optional

* fix nits.

* use typing Dict

---------

Co-authored-by: davidb <davidb@worker-10.soperator-worker-svc.soperator.svc.cluster.local>
Co-authored-by: David Briand <david@photoroom.com>
Co-authored-by: davidb <davidb@worker-8.soperator-worker-svc.soperator.svc.cluster.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2025-10-21 20:55:55 +05:30
Sayak Paul
2f2b325952 Merge branch 'main' into feat/autoencodermixin 2025-10-21 20:05:32 +05:30
Sayak Paul
b3e56e71fb styling issues. (#12522) 2025-10-21 20:04:54 +05:30
Sayak Paul
72ce8edf2e Merge branch 'main' into feat/autoencodermixin 2025-10-21 11:53:41 +05:30
sayakpaul
231b316fc0 up 2025-10-20 20:19:07 -10:00
sayakpaul
5d30c5bd00 Merge branch 'main' into feat/autoencodermixin 2025-10-20 20:17:13 -10:00
Steven Liu
5b5fa49a89 [docs] Organize toctree by modality (#12514)
* reorganize

* fix

---------

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-10-21 10:18:54 +05:30
Fei Xie
decfa3c9e1 Fix: Use incorrect temporary variable key when replacing adapter name… (#12502)
Fix: Use incorrect temporary variable key when replacing adapter name in state dict within load_lora_adapter function

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-10-20 15:45:37 -10:00
Dhruv Nair
48305755bf Raise warning instead of error when imports are missing for custom code (#12513)
update
2025-10-20 07:02:23 -10:00
dg845
7853bfbed7 Remove Qwen Image Redundant RoPE Cache (#12452)
Refactor QwenEmbedRope to only use the LRU cache for RoPE caching
2025-10-19 18:41:58 -07:00
sayakpaul
18054507b9 up 2025-10-19 09:31:36 -10:00
Sayak Paul
a3d205c54a Merge branch 'main' into feat/autoencodermixin 2025-10-20 00:35:34 +05:30
Lev Novitskiy
23ebbb4bc8 Kandinsky 5 is finally in Diffusers! (#12478)
* add kandinsky5 transformer pipeline first version

---------

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Charles <charles@huggingface.co>
2025-10-17 18:34:30 -10:00
Ali Imran
1b456bd5d5 docs: cleanup of runway model (#12503)
* cleanup of runway model

* quality fixes
2025-10-17 14:10:50 -07:00
Sayak Paul
af769881d3 [tests] introduce VAETesterMixin to consolidate tests for slicing and tiling (#12374)
* up

* up

* up

* up

* up

* u[

* up

* up

* up
2025-10-17 12:02:29 +05:30
Sayak Paul
4715c5c769 [ci] xfail more incorrect transformer imports. (#12455)
* xfail more incorrect transformer imports.

* xfail more.

* up

* up

* up
2025-10-17 10:35:19 +05:30
Sayak Paul
44c711c240 Merge branch 'main' into feat/autoencodermixin 2025-10-17 07:55:02 +05:30
Steven Liu
dbe413668d [CI] Check links (#12491)
* check links

* update

* feedback

* remove
2025-10-16 10:38:16 -07:00
Steven Liu
26475082cb [docs] Attention checks (#12486)
* checks

* feedback

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-10-16 09:19:30 -07:00
YiYi Xu
f072c64bf2 ltx0.9.8 (without IC lora, autoregressive sampling) (#12493)
update

Co-authored-by: Aryan <aryan@huggingface.co>
2025-10-15 07:41:17 -10:00
Sayak Paul
aed636f5f0 [tests] fix clapconfig for text backbone in audioldm2 (#12490)
fix clapconfig for text backbone in audioldm2
2025-10-15 10:57:09 +05:30
Sayak Paul
53a10518b9 remove unneeded checkpoint imports. (#12488) 2025-10-15 09:51:18 +05:30
Steven Liu
b4e6dc3037 [docs] Fix broken links (#12487)
fix broken links
2025-10-15 06:42:10 +05:30
Steven Liu
3eb40786ca [docs] Prompting (#12312)
* init

* fix

* batch inf

* feedback

* update
2025-10-14 13:53:56 -07:00
Meatfucker
a4bc845478 Fix missing load_video documentation and load_video import in WanVideoToVideoPipeline example code (#12472)
* Update utilities.md

Update missing load_video documentation

* Update pipeline_wan_video2video.py

Fix missing load_video import in example code
2025-10-14 10:43:21 -07:00
Manith Ratnayake
fa468c5d57 docs: api-pipelines-qwenimage typo fix (#12461) 2025-10-13 08:57:46 -07:00
sayakpaul
4c9db16874 correct wording. 2025-10-13 10:43:58 +05:30
sayakpaul
71f24e36de up 2025-10-13 10:22:01 +05:30
Steven Liu
8abc7aeb71 [docs] Fix syntax (#12464)
* fix syntax

* fix

* style

* fix
2025-10-11 08:13:30 +05:30
Sayak Paul
693d8a3a52 [modular] i2i and t2i support for kontext modular (#12454)
* up

* get ready

* fix import

* up

* up
2025-10-10 18:10:17 +05:30
Sayak Paul
a9df12ab45 Update Dockerfile to include zip wget for doc-builder (#12451) 2025-10-09 15:25:03 +05:30
Sayak Paul
a519272d97 [ci] revisit the installations in CI. (#12450)
* revisit the installations in CI.

* up

* up

* up

* empty

* up

* up

* up
2025-10-08 19:21:24 +05:30
Sayak Paul
345864eb85 fix more torch.distributed imports (#12425)
* up

* unguard.
2025-10-08 10:45:39 +05:30
Sayak Paul
35e538d46a fix dockerfile definitions. (#12424)
* fix dockerfile definitions.

* python 3.10 slim.

* up

* up

* up

* up

* up

* revert pr_tests.yml changes

* up

* up

* reduce python version for torch 2.1.0
2025-10-08 09:46:18 +05:30
Sayak Paul
2dc31677e1 Align Flux modular more and more with Qwen modular (#12445)
* start

* fix

* up
2025-10-08 09:22:34 +05:30
Linoy Tsaban
1066de8c69 [Qwen LoRA training] fix bug when offloading (#12440)
* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled
2025-10-07 18:27:15 +03:00
Sayak Paul
2d69bacb00 handle offload_state_dict when initing transformers models (#12438) 2025-10-07 13:51:20 +05:30
Changseop Yeom
0974b4c606 [i18n-KO] Fix typo and update translation in ethical_guidelines.md (#12435) 2025-10-06 14:24:05 -07:00
Charles
cf4b97b233 [perf] Cache version checks (#12399) 2025-10-06 17:45:34 +02:00
Sayak Paul
7f3e9b8695 make flux ready for mellon (#12419)
* make flux ready for mellon

* up

* Apply suggestions from code review

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>

---------

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-10-06 13:15:54 +05:30
SahilCarterr
ce90f9b2db [FIX] Text to image training peft version (#12434)
Fix peft error
2025-10-06 08:24:54 +05:30
Sayak Paul
c3675d4c9b [core] support QwenImage Edit Plus in modular (#12416)
* up

* up

* up

* up

* up

* up

* remove saves

* move things around a bit.

* get ready.
2025-10-05 21:57:13 +05:30
Vladimir Mandic
2b7deffe36 fix scale_shift_factor being on cpu for wan and ltx (#12347)
* wan fix scale_shift_factor being on cpu

* apply device cast to ltx transformer

* Apply style fixes

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-10-05 09:23:38 +05:30