1
0
mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

6187 Commits

Author SHA1 Message Date
RuoyiDu
f6b6a7181e Add z-image-omni-base implementation (#12857)
* Add z-image-omni-base implementation

* Merged into one transformer for Z-Image.

* Fix bugs for controlnet after merging the main branch new feature.

* Fix for auto_pipeline, Add Styling.

* Refactor noise handling and modulation

- Add select_per_token function for per-token value selection
- Separate adaptive modulation logic
- Cleanify t_noisy/clean variable naming
- Move image_noise_mask handler from forward to pipeline

* Styling & Formatting.

* Rewrite code with more non-forward func & clean forward.

1.Change to one forward with shorter code with omni code (None).
2.Split out non-forward funcs: _build_unified_sequence, _prepare_sequence, patchify, pad.

* Styling & Formatting.

* Manual check fix-copies in controlnet, Add select_per_token, _patchify_image, _pad_with_ids; Styling.

* Add Import in pipeline __init__.py.

---------

Co-authored-by: Jerry Qilong Wu <xinglong.wql@alibaba-inc.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-12-23 23:45:35 -10:00
Alvaro Bartolome
52766e6a69 Use T5Tokenizer instead of MT5Tokenizer (removed in Transformers v5.0+) (#12877)
Use `T5Tokenizer` instead of `MT5Tokenizer`

Given that the `MT5Tokenizer` in `transformers` is just a "re-export" of
`T5Tokenizer` as per
https://github.com/huggingface/transformers/blob/v4.57.3/src/transformers/models/mt5/tokenization_mt5.py
)on latest available stable Transformers i.e., v4.57.3), this commit
updates the imports to point to `T5Tokenizer` instead, so that those
still work with Transformers v5.0.0rc0 onwards.
2025-12-23 06:57:41 -10:00
Miguel Martin
973a077c6a Cosmos Predict2.5 14b Conversion (#12863)
14b conversion
2025-12-22 08:02:06 -10:00
Alvaro Bartolome
0c4f6c9cff Add OvisImagePipeline in AUTO_TEXT2IMAGE_PIPELINES_MAPPING (#12876) 2025-12-22 07:14:03 -10:00
MatrixTeam-AI
262ce19bff Feature: Add Mambo-G Guidance as Guider (#12862)
* Feature: Add Mambo-G Guidance to Qwen-Image Pipeline

* change to guider implementation

* fix copied code residual

* Update src/diffusers/guiders/magnitude_aware_guidance.py

* Apply style fixes

---------

Co-authored-by: Pscgylotti <pscgylotti@github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-12-19 13:10:40 -10:00
YiYi Xu
f7753b1bc8 more update in modular (#12560)
* move node registry to mellon

* up

* fix

* modula rpipeline update: filter out none for input_names, fix default blocks for pipe.init() and allow user pass additional kwargs_type in a dict

* qwen modular refactor, unpack before decode

* update mellon node config, adding* to required_inputs and required_model_inputs

* modularpipeline.from_pretrained: error out if no config found

* add a component_names property to modular blocks to be consistent!

* flux image_encoder -> vae_encoder

* controlnet_bundle

* refator MellonNodeConfig MellonPipelineConfig

* refactor & simplify mellon utils

* vae_image_encoder -> vae_encoder

* mellon config save keep key order

* style + copies

* add kwargs input for zimage
2025-12-18 19:25:20 -10:00
Miguel Martin
b5309683cb Cosmos Predict2.5 Base: inference pipeline, scheduler & chkpt conversion (#12852)
* cosmos predict2.5 base: convert chkpt & pipeline
- New scheduler: scheduling_flow_unipc_multistep.py
- Changes to TransformerCosmos for text embeddings via crossattn_proj

* scheduler cleanup

* simplify inference pipeline

* cleanup scheduler + tests

* Basic tests for flow unipc

* working b2b inference

* Rename everything

* Tests for pipeline present, but not working (predict2 also not working)

* docstring update

* wrapper pipelines + make style

* remove unnecessary files

* UniPCMultistep: support use_karras_sigmas=True and use_flow_sigmas=True

* use UniPCMultistepScheduler + fix tests for pipeline

* Remove FlowUniPCMultistepScheduler

* UniPCMultistepScheduler for use_flow_sigmas=True & use_karras_sigmas=True

* num_inference_steps=36 due to bug in scheduler used by predict2.5

* Address comments

* make style + make fix-copies

* fix tests + remove references to old pipelines

* address comments

* add revision in from_pretrained call

* fix tests
2025-12-19 05:38:18 +05:30
hlky
55463f7ace Z-Image-Turbo ControlNet (#12792)
* init

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-12-17 09:44:20 -10:00
naykun
f9c1e612fb Qwen Image Layered Support (#12853)
* [qwen-image] qwen image layered support

* [qwen-image] update doc

* [qwen-image] fix pr comments

* Apply style fixes

* make fix-copies

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-17 16:57:57 +05:30
Wang, Yi
87f7d11143 extend TorchAoTest::test_model_memory_usage to other platform (#12768)
* extend TorchAoTest::test_model_memory_usage to other platform

Signe-off-by: Wang, Yi <yi.a.wang@inel.com>

* add some comments

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2025-12-17 13:44:08 +05:30
junqiangwu
5e48f466b9 fix the prefix_token_len bug (#12845) 2025-12-15 22:02:25 -10:00
junqiangwu
a748a839ad Add support for LongCat-Image (#12828)
* Add  LongCat-Image

* Update src/diffusers/models/transformers/transformer_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/transformers/transformer_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/transformers/transformer_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/transformers/transformer_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* fix code

* add doc

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image_edit.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image_edit.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* fix code & mask style & fix-copies

* Apply style fixes

* fix single input rewrite error

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: hadoop-imagen <hadoop-imagen@psxfb7pxrbvmh3oq-worker-0.psxfb7pxrbvmh3oq.hadoop-aipnlp.svc.cluster.local>
2025-12-15 07:45:17 -10:00
Yuqian Hong
58519283e7 Support for control-lora (#10686)
* run control-lora on diffusers

* cannot load lora adapter

* test

* 1

* add control-lora

* 1

* 1

* 1

* fix PeftAdapterMixin

* fix module_to_save bug

* delete json print

* resolve conflits

* merged but bug

* change peft.py

* 1

* delete state_dict print

* fix alpha

* Create control_lora.py

* Add files via upload

* rename

* no need modify as peft updated

* add doc

* fix code style

* styling isn't that hard 😉

* empty

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-15 15:52:42 +05:30
Wang, Yi
0c1ccc0775 fix pytest tests/pipelines/pixart_sigma/test_pixart.py::PixArtSigmaPi… (#12842)
fix pytest tests/pipelines/pixart_sigma/test_pixart.py::PixArtSigmaPipelineIntegrationTests::test_pixart_512 in xpu

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-15 14:36:01 +05:30
naykun
b8a4cbac14 [qwen-image] edit 2511 support (#12839)
* [qwen-image] edit 2511 support

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-12-15 12:35:01 +05:30
Wang, Yi
17c0e79dbd support CP in native flash attention (#12829)
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-12 13:18:39 +05:30
Sayak Paul
1567243463 [lora] Remove lora docs unneeded and add " # Copied from ..." (#12824)
* remove unneeded docs on load_lora_weights().

* remove more.

* up[

* up

* up
2025-12-12 08:31:27 +05:30
Sayak Paul
0eac64c7a6 Update distributed_inference.md to correct syntax (#12827) 2025-12-11 08:46:43 -08:00
Sayak Paul
10e820a2dd post release 0.36.0 (#12804)
* post release 0.36.0

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-12-11 22:01:59 +05:30
Sayak Paul
6708f5c76d [docs] improve distributed inference cp docs. (#12810)
* improve distributed inference cp docs.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-12-10 08:25:07 -08:00
Dhruv Nair
be3c2a0667 [WIP] Add Flux2 modular (#12763)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update
2025-12-10 12:19:07 +05:30
Sayak Paul
8b4722de57 Fix Qwen Edit Plus modular for multi-image input (#12601)
* try to fix qwen edit plus multi images (modular)

* up

* up

* test

* up

* up
2025-12-09 10:08:30 -10:00
YiYi Xu
07ea0786e8 [Modular]z-image (#12808)
* initiL

* up up

* fix: z_image -> z-image

* style

* copy

* fix more

* some docstring fix
2025-12-09 08:08:41 -10:00
David El Malih
54fa0745c3 Improve docstrings and type hints in scheduling_dpmsolver_singlestep.py (#12798)
feat: add flow sigmas, dynamic shifting, and refine type hints in DPMSolverSinglestepScheduler
2025-12-08 08:58:57 -08:00
David Lacalle Castillo
3d02cd543e [PRX] Improve model compilation (#12787)
* Reimplement img2seq & seq2img in PRX to enable ONNX build without Col2Im (incompatible with TensorRT).

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-08 17:42:17 +05:30
CalamitousFelicitousness
2246d2c7c4 Add ZImageImg2ImgPipeline (#12751)
* Add ZImageImg2ImgPipeline

Updated the pipeline structure to include ZImageImg2ImgPipeline
    alongside ZImagePipeline.
Implemented the ZImageImg2ImgPipeline class for image-to-image
    transformations, including necessary methods for
    encoding prompts, preparing latents, and denoising.
Enhanced the auto_pipeline to map the new ZImageImg2ImgPipeline
    for image generation tasks.
Added unit tests for ZImageImg2ImgPipeline to ensure
    functionality and performance.
Updated dummy objects to include ZImageImg2ImgPipeline for
    testing purposes.

* Address review comments for ZImageImg2ImgPipeline

- Add `# Copied from` annotations to encode_prompt and _encode_prompt
- Add ZImagePipeline to auto_pipeline.py for AutoPipeline support

* Add ZImage pipeline documentation

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-12-07 22:06:23 -10:00
YiYi Xu
671149e036 [HunyuanVideo1.5] support step-distilled (#12802)
* support step-distilled

* style
2025-12-07 21:50:36 -10:00
jiqing-feng
f67639b0bb add post init for safty checker (#12794)
* add post init for safty checker

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check transformers version before post init

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Apply style fixes

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-12-08 11:31:03 +05:30
jingyu-ml
5a74319715 Update the TensorRT-ModelOPT to Nvidia-ModelOPT (#12793)
Update the naming

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-08 10:07:04 +05:30
Tran Thanh Luan
6290fdfda4 [Feat] TaylorSeer Cache (#12648)
* init taylor_seer cache

* make compatible with any tuple size returned

* use logger for printing, add warmup feature

* still update in warmup steps

* refractor, add docs

* add configurable cache, skip compute module

* allow special cache ids only

* add stop_predicts (cooldown)

* update docs

* apply ruff

* update to handle multple calls per timestep

* refractor to use state manager

* fix format & doc

* chores: naming, remove redundancy

* add docs

* quality & style

* fix taylor precision

* Apply style fixes

* add tests

* Apply style fixes

* Remove TaylorSeerCacheTesterMixin from flux2 tests

* rename identifiers, use more expressive taylor predict loop

* torch compile compatible

* Apply style fixes

* Update src/diffusers/hooks/taylorseer_cache.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* update docs

* make fix-copies

* fix example usage.

* remove tests on flux kontext

---------

Co-authored-by: toilaluan <toilaluan@github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-06 05:39:54 +05:30
David El Malih
256e010674 Improve docstrings and type hints in scheduling_deis_multistep.py (#12796)
* feat: Add `flow_prediction` to `prediction_type`, introduce `use_flow_sigmas`, `flow_shift`, `use_dynamic_shifting`, and `time_shift_type` parameters, and refine type hints for various arguments.

* style: reformat argument wrapping in `_convert_to_beta` and `index_for_timestep` method signatures.
2025-12-05 08:48:01 -08:00
Sayak Paul
8430ac2a2f [docs] minor fixes to kandinsky docs (#12797)
up
2025-12-05 08:33:05 -08:00
sayakpaul
bb9e713d02 move kandisnky docs. 2025-12-05 21:44:24 +07:00
Álvaro Somoza
c98c157a9e [Docs] Add Z-Image docs (#12775)
* initial

* toctree

* fix

* apply review and fix

* Update docs/source/en/api/pipelines/z_image.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/z_image.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/z_image.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-12-05 11:05:47 -03:00
swappy
f12d161d67 Fix broken group offloading with block_level for models with standalone layers (#12692)
* fix: group offloading to support standalone computational layers in block-level offloading

* test: for models with standalone and deeply nested layers in block-level offloading

* feat: support for block-level offloading in group offloading config

* fix: group offload block modules to AutoencoderKL and AutoencoderKLWan

* fix: update group offloading tests to use AutoencoderKL and adjust input dimensions

* refactor: streamline block offloading logic

* Apply style fixes

* update tests

* update

* fix for failing tests

* clean up

* revert to use skip_keys

* clean up

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-12-05 18:54:05 +05:30
David Bertoin
8d415a6f48 PRX Set downscale_freq_shift to 0 for consistency with internal implementation (#12791)
fix timestepembeddings downscale_freq_shift to be consitant with Photoroom's original code
2025-12-04 10:57:14 -10:00
Sayak Paul
7de51b826c [lora] support more ZImage LoRAs (#12790)
up

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-12-04 09:01:11 -10:00
Jiang
cd00ba685b fix spatial compression ratio error for AutoEncoderKLWan doing tiled encode (#12753)
fix spatial compression ratio compute error for AutoEncoderKLWan

Co-authored-by: lirui.926 <lirui.926@bytedance.com>
2025-12-04 08:57:13 -10:00
David El Malih
2842c14c5f Improve docstrings and type hints in scheduling_unipc_multistep.py (#12767)
refactor: add type hints and update docstrings for UniPCMultistepScheduler parameters and methods.
2025-12-04 10:10:54 -08:00
Sayak Paul
c318686090 Update attention_backends.md to format kernels (#12757) 2025-12-04 07:48:23 -08:00
hlky
6028613226 Z-Image-Turbo from_single_file (#12756)
* Z-Image-Turbo `from_single_file`

* compute_dtype

* -device cast
2025-12-04 20:22:48 +05:30
Sayak Paul
a1f36ee3ef [Z-Image] various small changes, Z-Image transformer tests, etc. (#12741)
* start zimage model tests.

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* Revert "up"

This reverts commit bca3e27c96.

* expand upon compilation failure reason.

* Update tests/models/transformers/test_models_transformer_z_image.py

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* reinitialize the padding tokens to ones to prevent NaN problems.

* updates

* up

* skipping ZImage DiT tests

* up

* up

---------

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
2025-12-03 19:35:46 +05:30
Sayak Paul
d96cbacacd [tests] fix hunuyanvideo 1.5 offloading tests. (#12782)
fix hunuyanvideo 1.5 offloading tests.
2025-12-03 18:07:59 +05:30
Aditya Borate
5ab5946931 Fix: leaf_level offloading breaks after delete_adapters (#12639)
* Fix(peft): Re-apply group offloading after deleting adapters

* Test: Add regression test for group offloading + delete_adapters

* Test: Add assertions to verify output changes after deletion

* Test: Add try/finally to clean up group offloading hooks

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-03 17:39:11 +05:30
Lev Novitskiy
d0c54e5563 Kandinsky 5.0 Video Pro and Image Lite (#12664)
* add transformer pipeline first version


---------

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Charles <charles@huggingface.co>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: dmitrienkoae <dmitrienko.ae@phystech.edu>
Co-authored-by: nvvaulin <nvvaulin@gmail.com>
2025-12-03 00:46:37 -10:00
Dhruv Nair
1908c47600 Deprecate upcast_vae in SDXL based pipelines (#12619)
* update

* update

* Revert "update"

This reverts commit 73906381ab.

* Revert "update"

This reverts commit 21a03f93ef.

* update

* update

* update

* update

* update
2025-12-03 15:53:23 +05:30
Sayak Paul
759ea58708 [core] reuse AttentionMixin for compatible classes (#12463)
* remove attn_processors property

* more

* up

* up more.

* up

* add AttentionMixin to AuraFlow.

* up

* up

* up

* up
2025-12-03 13:58:33 +05:30
Sayak Paul
f48f9c250f [core] start varlen variants for attn backend kernels. (#12765)
* start varlen variants for attn backend kernels.

* maybe unflatten heads.

* updates

* remove unused function.

* doc

* up
2025-12-03 13:34:52 +05:30
Kimbing Ng
3c05b9f71c Fixes #12673. record_stream in group offloading is not working properly (#12721)
* Fixes #12673.

    Wrong default_stream is used. leading to wrong execution order when record_steram is enabled.

* update

* Update test

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-03 11:37:11 +05:30
Jerry Wu
9379b2391b Fix TPU (torch_xla) compatibility Error about tensor repeat func along with empty dim. (#12770)
* Refactor image padding logic to pervent zero tensor in transformer_z_image.py

* Apply style fixes

* Add more support to fix repeat bug on tpu devices.

* Fix for dynamo compile error for multi if-branches.

---------

Co-authored-by: Mingjia Li <mingjiali@tju.edu.cn>
Co-authored-by: Mingjia Li <mail@mingjia.li>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-12-02 12:51:23 -10:00