mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
Commit Graph

6106 Commits

Author SHA1 Message Date
Daniel Gu
cbb10b8dca Support num_videos_per_prompt for prompt embeddings 2025-12-23 07:01:17 +01:00
Daniel Gu
6e6ce20595 Duplicate scheduler for audio latents 2025-12-23 06:40:35 +01:00
Daniel Gu
54bfc5d617 Add Audio VAE logic to T2V pipeline 2025-12-23 03:51:22 +01:00
Daniel Gu
ae3b6e7cc2 Merge branch 'ltx-2-transformer' into ltx-2-t2v-pipeline 2025-12-23 02:59:33 +01:00
Daniel Gu
d303e2a6ff Conversion script for LTX 2.0 Audio VAE Decoder 2025-12-23 02:48:15 +01:00
Daniel Gu
5f7e43d17f Add imports for LTX 2.0 Audio VAE 2025-12-23 02:08:51 +01:00
dg845
7bb4cf76ce Merge pull request #5 from huggingface/audio-decoder
Audio decoder
2025-12-22 17:00:11 -08:00
sayakpaul
409d651bab resolve conflicts. 2025-12-22 15:59:31 +05:30
sayakpaul
8134da6a56 up 2025-12-22 15:55:29 +05:30
Sayak Paul
059999a3f7 up 2025-12-22 10:24:55 +00:00
sayakpaul
58257eb0e0 up 2025-12-22 15:45:56 +05:30
Sayak Paul
5f0f2a03f7 up 2025-12-22 10:06:39 +00:00
Daniel Gu
d0f9cdaab1 Rough initial LTX 2.0 pipeline implementation 2025-12-22 10:07:20 +01:00
Daniel Gu
0028955c37 Initial LTX 2.0 text encoder implementation 2025-12-22 10:06:01 +01:00
sayakpaul
4904fd6fa5 up 2025-12-22 13:46:58 +05:30
sayakpaul
907896d533 simplify and clean up 2025-12-22 13:41:41 +05:30
sayakpaul
e54cd6bb1d up 2025-12-22 13:03:40 +05:30
sayakpaul
f4c2435d61 init registration. 2025-12-22 12:25:36 +05:30
sayakpaul
b34ddb1736 start audio decoder. 2025-12-22 12:23:31 +05:30
Daniel Gu
6c56954fa8 Use RMSNorm implementation closer to original for LTX 2.0 video VAE 2025-12-20 02:40:38 +01:00
dg845
b1cf6ff8a9 Merge pull request #2 from huggingface/ltx-2-video-vae
LTX 2.0 Video VAE Implementation
2025-12-19 16:36:38 -08:00
dg845
8bfeb4af56 Merge pull request #3 from huggingface/ltx-2-vocoder
LTX 2.0 Vocoder Implementation
2025-12-19 16:21:31 -08:00
Daniel Gu
c6a11a5530 Initial LTX 2.0 vocoder implementation 2025-12-19 12:17:10 +01:00
Daniel Gu
a748975a7c Get diffusers implementation on par with official LTX 2.0 video VAE implementation 2025-12-19 07:02:38 +01:00
Daniel Gu
491aae08d8 Add initial LTX 2.0 video VAE tests (part 2) 2025-12-17 11:39:09 +01:00
Daniel Gu
5b950d6fef Add initial LTX 2.0 video VAE tests 2025-12-17 11:30:15 +01:00
Daniel Gu
baf23e2da3 Explicitly specify temporal and spatial VAE scale factors when converting 2025-12-17 11:14:45 +01:00
Daniel Gu
269cf7b40d Initial implementation of LTX 2.0 video VAE 2025-12-17 10:51:34 +01:00
Daniel Gu
bda3ff13db Fix LTX 2 transformer bugs so consistency test passes 2025-12-16 10:53:43 +01:00
Daniel Gu
a7bc052e89 Improve dummy inputs and add test for LTX 2 transformer consistency 2025-12-16 10:44:02 +01:00
Daniel Gu
57a8b9c330 Allow LTX 2 transformer to be loaded from local path for conversion 2025-12-16 10:38:03 +01:00
Daniel Gu
d86f89ddea Add more LTX 2 transformer audio arguments 2025-12-16 07:58:12 +01:00
Daniel Gu
a5f2d2da6c Initial script to convert LTX 2 transformer to diffusers 2025-12-15 07:09:42 +01:00
Daniel Gu
aeecc4d712 Fix LTX 2 transformer shape errors 2025-12-15 06:38:57 +01:00
Daniel Gu
5765759cd3 Get LTX 2 transformer compile tests passing 2025-12-15 03:38:34 +01:00
Daniel Gu
780fb61d32 Remove RoPE debug print statements 2025-12-13 10:37:24 +01:00
Daniel Gu
e100b8f2a3 Rename LTX 2 compile test class to have LTX2 2025-12-13 10:34:11 +01:00
Daniel Gu
980591de53 Get LTX 2 transformer tests working 2025-12-13 04:57:23 +01:00
Daniel Gu
b3096c3c9e Add tests for LTX 2 transformer model 2025-12-13 04:55:41 +01:00
Daniel Gu
aa602ac483 Initial LTX 2.0 transformer implementation 2025-12-12 07:52:33 +01:00
Sayak Paul
8b4722de57 Fix Qwen Edit Plus modular for multi-image input (#12601)
* try to fix qwen edit plus multi images (modular)

* up

* up

* test

* up

* up
2025-12-09 10:08:30 -10:00
YiYi Xu
07ea0786e8 [Modular]z-image (#12808)
* initial

* up up

* fix: z_image -> z-image

* style

* copy

* fix more

* some docstring fix
2025-12-09 08:08:41 -10:00
David El Malih
54fa0745c3 Improve docstrings and type hints in scheduling_dpmsolver_singlestep.py (#12798)
feat: add flow sigmas, dynamic shifting, and refine type hints in DPMSolverSinglestepScheduler
2025-12-08 08:58:57 -08:00
David Lacalle Castillo
3d02cd543e [PRX] Improve model compilation (#12787)
* Reimplement img2seq & seq2img in PRX to enable ONNX build without Col2Im (incompatible with TensorRT).

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-08 17:42:17 +05:30
CalamitousFelicitousness
2246d2c7c4 Add ZImageImg2ImgPipeline (#12751)
* Add ZImageImg2ImgPipeline

Updated the pipeline structure to include ZImageImg2ImgPipeline
    alongside ZImagePipeline.
Implemented the ZImageImg2ImgPipeline class for image-to-image
    transformations, including necessary methods for
    encoding prompts, preparing latents, and denoising.
Enhanced the auto_pipeline to map the new ZImageImg2ImgPipeline
    for image generation tasks.
Added unit tests for ZImageImg2ImgPipeline to ensure
    functionality and performance.
Updated dummy objects to include ZImageImg2ImgPipeline for
    testing purposes.

* Address review comments for ZImageImg2ImgPipeline

- Add `# Copied from` annotations to encode_prompt and _encode_prompt
- Add ZImagePipeline to auto_pipeline.py for AutoPipeline support

* Add ZImage pipeline documentation

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-12-07 22:06:23 -10:00
YiYi Xu
671149e036 [HunyuanVideo1.5] support step-distilled (#12802)
* support step-distilled

* style
2025-12-07 21:50:36 -10:00
jiqing-feng
f67639b0bb add post init for safety checker (#12794)
* add post init for safety checker

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check transformers version before post init

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Apply style fixes

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-12-08 11:31:03 +05:30
jingyu-ml
5a74319715 Update the TensorRT-ModelOPT to Nvidia-ModelOPT (#12793)
Update the naming

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-08 10:07:04 +05:30
Tran Thanh Luan
6290fdfda4 [Feat] TaylorSeer Cache (#12648)
* init taylor_seer cache

* make compatible with any tuple size returned

* use logger for printing, add warmup feature

* still update in warmup steps

* refactor, add docs

* add configurable cache, skip compute module

* allow special cache ids only

* add stop_predicts (cooldown)

* update docs

* apply ruff

* update to handle multiple calls per timestep

* refactor to use state manager

* fix format & doc

* chores: naming, remove redundancy

* add docs

* quality & style

* fix taylor precision

* Apply style fixes

* add tests

* Apply style fixes

* Remove TaylorSeerCacheTesterMixin from flux2 tests

* rename identifiers, use more expressive taylor predict loop

* torch compile compatible

* Apply style fixes

* Update src/diffusers/hooks/taylorseer_cache.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* update docs

* make fix-copies

* fix example usage.

* remove tests on flux kontext

---------

Co-authored-by: toilaluan <toilaluan@github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-06 05:39:54 +05:30
David El Malih
256e010674 Improve docstrings and type hints in scheduling_deis_multistep.py (#12796)
* feat: Add `flow_prediction` to `prediction_type`, introduce `use_flow_sigmas`, `flow_shift`, `use_dynamic_shifting`, and `time_shift_type` parameters, and refine type hints for various arguments.

* style: reformat argument wrapping in `_convert_to_beta` and `index_for_timestep` method signatures.
2025-12-05 08:48:01 -08:00