1
0
mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
Commit Graph

6150 Commits

Author SHA1 Message Date
Sayak Paul
537d4de2cc Update distributed_inference.md to reposition sections 2026-01-13 20:38:37 +05:30
Bissmella Bahaduri
9d68742214 Add Unified Sequence Parallel attention (#12693)
* initial scheme of unified-sp

* initial all_to_all_double

* bug fixes, added cmnts

* unified attention prototype done

* remove raising value error in contextParallelConfig to enable unified attention

* bug fix

* feat: Adds Test for Unified SP Attention and Fixes a bug in Template Ring Attention

* bug fix, lse calculation, testing

bug fixes, lse calculation

-

switched to _all_to_all_single helper in _all_to_all_dim_exchange due contiguity issues

bug fix

bug fix

bug fix

* addressing comments

* sequence parallelsim bug fixes

* code format fixes

* Apply style fixes

* code formatting fix

* added unified attention docs and removed test file

* Apply style fixes

* tip for unified attention in docs at distributed_inference.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update distributed_inference.md, adding benchmarks

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/training/distributed_inference.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* function name fix

* fixed benchmark in docs

---------

Co-authored-by: KarthikSundar2002 <karthiksundar30092002@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-01-13 09:16:51 +05:30
dg845
f1a93c765f Add Flag to PeftLoraLoaderMixinTests to Enable/Disable Text Encoder LoRA Tests (#12962)
* Improve incorrect LoRA format error message

* Add flag in PeftLoraLoaderMixinTests to disable text encoder LoRA tests

* Apply changes to LTX2LoraTests

* Further improve incorrect LoRA format error msg following review

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-01-12 16:01:58 -08:00
Leo Jiang
29a930a142 Bugfix for flux2 img2img2 prediction (#12855)
* Bugfix for dreambooth flux2 img2img2

* Bugfix for dreambooth flux2 img2img2

* Bugfix for dreambooth flux2 img2img2

* Bugfix for dreambooth flux2 img2img2

* Bugfix for dreambooth flux2 img2img2

* Bugfix for dreambooth flux2 img2img2

Co-authored-by: tcaimm <93749364+tcaimm@users.noreply.github.com>

---------

Co-authored-by: tcaimm <93749364+tcaimm@users.noreply.github.com>
2026-01-12 20:07:02 +05:30
Kashif Rasul
dad5cb55e6 Fix QwenImage txt_seq_lens handling (#12702)
* Fix QwenImage txt_seq_lens handling

* formatting

* formatting

* remove txt_seq_lens and use bool  mask

* use compute_text_seq_len_from_mask

* add seq_lens to dispatch_attention_fn

* use joint_seq_lens

* remove unused index_block

* WIP: Remove seq_lens parameter and use mask-based approach

- Remove seq_lens parameter from dispatch_attention_fn
- Update varlen backends to extract seqlens from masks
- Update QwenImage to pass 2D joint_attention_mask
- Fix native backend to handle 2D boolean masks
- Fix sage_varlen seqlens_q to match seqlens_k for self-attention

Note: sage_varlen still producing black images, needs further investigation

* fix formatting

* undo sage changes

* xformers support

* hub fix

* fix torch compile issues

* fix tests

* use _prepare_attn_mask_native

* proper deprecation notice

* add deprecate to txt_seq_lens

* Update src/diffusers/models/transformers/transformer_qwenimage.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/transformers/transformer_qwenimage.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Only create the mask if there's actual padding

* fix order of docstrings

* Adds performance benchmarks and optimization details for QwenImage

Enhances documentation with comprehensive performance insights for QwenImage pipeline:

* rope_text_seq_len = text_seq_len

* rename to max_txt_seq_len

* removed deprecated args

* undo unrelated change

* Updates QwenImage performance documentation

Removes detailed attention backend benchmarks and simplifies torch.compile performance description

Focuses on key performance improvement with torch.compile, highlighting the specific speedup from 4.70s to 1.93s on an A100 GPU

Streamlines the documentation to provide more concise and actionable performance insights

* Updates deprecation warnings for txt_seq_lens parameter

Extends deprecation timeline for txt_seq_lens from version 0.37.0 to 0.39.0 across multiple Qwen image-related models

Adds a new unit test to verify the deprecation warning behavior for the txt_seq_lens parameter

* fix compile

* formatting

* fix compile tests

* rename helper

* remove duplicate

* smaller values

* removed

* use torch.cond for torch compile

* Construct joint attention mask once

* test different backends

* construct joint attention mask once to avoid reconstructing in every block

* Update src/diffusers/models/attention_dispatch.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* formatting

* raising an error from the EditPlus pipeline when batch_size > 1

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: cdutr <dutra_carlos@hotmail.com>
2026-01-12 13:45:09 +05:30
Francisco Kurucz
b86bd99eac Fix link to diffedit implementation reference (#12708) 2026-01-10 11:13:23 -08:00
omahs
5b202111bf Fix typos (#12705) 2026-01-10 11:11:15 -08:00
Sayak Paul
4ac2b4a521 [docs] polish caching docs. (#12684)
* polish caching docs.

* Update docs/source/en/optimization/cache.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/cache.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* up

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2026-01-10 10:09:05 -08:00
YiYi Xu
418313bbf6 [Modular] better docstring (#12932)
add output to auto blocks + core denoising block for better doc string
2026-01-09 23:53:56 -10:00
Rafael Tvelov
2120c3096f Fix: typo in autoencoder_dc.py (#12687)
Fix typo in autoencoder_dc.py

Fixing typo in `get_block` function's parameter name:
"qkv_mutliscales" -> "qkv_multiscales"

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2026-01-09 22:01:54 -10:00
Sayak Paul
ed6e5ecf67 [LoRA] add LoRA support to LTX-2 (#12933)
* up

* fixes

* tests

* docs.

* fix

* change loading info.

* up

* up
2026-01-10 11:27:22 +05:30
Sayak Paul
d44b5f86e6 fix how is_fsdp is determined (#12960)
up
2026-01-10 10:34:25 +05:30
Jay Wu
02c7adc356 [ChronoEdit] support multiple loras (#12679)
Co-authored-by: wjay <wjay@nvidia.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2026-01-09 15:50:16 -10:00
Sayak Paul
a3cc0e7a52 [modular] error early in enable_auto_cpu_offload (#12578)
error early in auto_cpu_offload
2026-01-09 15:30:52 -10:00
Daniel Socek
2a6cdc0b3e Fix ftfy name error in Wan pipeline (#12314)
Signed-off-by: Daniel Socek <daniel.socek@intel.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2026-01-09 14:02:40 -10:00
SahilCarterr
1791306739 [Fix] syntax in QwenImageEditPlusPipeline (#12371)
* Fixes syntax for consistency among pipelines

* Update test_qwenimage_edit_plus.py
2026-01-09 13:55:42 -10:00
Samu Tamminen
df6516a716 Align HunyuanVideoConditionEmbedding with CombinedTimestepGuidanceTextProjEmbeddings (#12316)
conditioning additions inline with CombinedTimestepGuidanceTextProjEmbeddings

Co-authored-by: Samu Tamminen <samutamm@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2026-01-09 13:51:04 -10:00
Steven Liu
5794ffffbe [docs] Remote inference (#12372)
* init

* fix
2026-01-09 13:32:14 -10:00
Titong Jiang
4fb44bdf91 Fix wrong param types, docs, and handles noise=None in scale_noise of FlowMatching schedulers (#11669)
* Bug: Fix wrong params, docs, and handles noise=None

* make noise a required arg

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2026-01-09 11:42:33 -10:00
Linoy Tsaban
b7a81582ae [LoRA] add lora_alpha to sana README (#11780)
add lora alpha to readme
2026-01-09 11:28:39 -10:00
Bhavya Bahl
4b64b5603f Change timestep device to cpu for xla (#11501)
* Change timestep device to cpu for xla

* Add all pipelines

* ruff format

* Apply style fixes

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-01-09 11:22:51 -10:00
Kashif Rasul
2bb640f8ea [Research] Latent Perceptual Loss (LPL) for Stable Diffusion XL (#11573)
* initial

* added readme

* fix formatting

* added logging

* formatting

* use config

* debug

* better

* handle SNR

* floats have no item()

* remove debug

* formatting

* add paper link

* acknowledge reference source

* rename script

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-01-09 10:24:21 -10:00
Fredy Rivera
2dc9d2af50 Add thread-safe wrappers for components in pipeline (examples/server-async/utils/requestscopedpipeline.py) (#12515)
* Basic implementation of request scheduling

* Basic editing in SD and Flux Pipelines

* Small Fix

* Fix

* Update for more pipelines

* Add examples/server-async

* Add examples/server-async

* Updated RequestScopedPipeline to handle a single tokenizer lock to avoid race conditions

* Fix

* Fix _TokenizerLockWrapper

* Fix _TokenizerLockWrapper

* Delete _TokenizerLockWrapper

* Fix tokenizer

* Update examples/server-async

* Fix server-async

* Optimizations in examples/server-async

* We keep the implementation simple in examples/server-async

* Update examples/server-async/README.md

* Update examples/server-async/README.md for changes to tokenizer locks and backward-compatible retrieve_timesteps

* The changes to the diffusers core have been undone and all logic is being moved to exmaples/server-async

* Update examples/server-async/utils/*

* Fix BaseAsyncScheduler

* Rollback in the core of the diffusers

* Update examples/server-async/README.md

* Complete rollback of diffusers core files

* Simple implementation of an asynchronous server compatible with SD3-3.5 and Flux Pipelines

* Update examples/server-async/README.md

* Fixed import errors in 'examples/server-async/serverasync.py'

* Flux Pipeline Discard

* Update examples/server-async/README.md

* Apply style fixes

* Add thread-safe wrappers for components in pipeline

Refactor requestscopedpipeline.py to add thread-safe wrappers for tokenizer, VAE, and image processor. Introduce locking mechanisms to ensure thread safety during concurrent access.

* Add wrappers.py

* Apply style fixes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-01-09 09:43:14 -10:00
Teriks
57e57cfae0 Store vae.config.scaling_factor to prevent missing attr reference (sdxl advanced dreambooth training script) (#12346)
Store vae.config.scaling_factor to prevent missing attr reference

In sdxl advanced dreambooth training script

vae.config.scaling_factor becomes inaccessible after: del vae

when: --cache_latents, and no --validation_prompt

Co-authored-by: Teriks <Teriks@users.noreply.github.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2026-01-09 09:42:30 -10:00
gapatron
644169433f Laplace Scheduler for DDPM (#11320)
* Add Laplace scheduler that samples more around mid-range noise levels (around log SNR=0), increasing performance (lower FID) with faster convergence speed, and robust to resolution and objective. Reference:  https://arxiv.org/pdf/2407.03297.

* Fix copies.

* Apply style fixes

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-01-09 09:16:02 -10:00
Ishan Modi
632765a5ee [Feature] MultiControlNet support for SD3Impainting (#11251)
* update

* update

* addressed PR comments

* update

* Apply suggestions from code review

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2026-01-09 08:55:16 -10:00
David El Malih
d36564f06a Improve docstrings and type hints in scheduling_consistency_models.py (#12931)
docs: improve docstring scheduling_consistency_models.py
2026-01-09 09:56:56 -08:00
Sayak Paul
441b69eabf [core] Handle progress bar and logging in distributed environments (#12806)
* disable progressbar in distributed.

* up

* up

* up

* up

* up

* up
2026-01-09 22:23:13 +05:30
Sayak Paul
d568c9773f [chore] remove controlnet implementations outside controlnet module. (#12152)
* remove controlnet implementations outside controlnet module.

* fix

* fix

* fix
2026-01-09 21:22:45 +05:30
Sayak Paul
3981c955ce [modular] Tests for custom blocks in modular diffusers (#12557)
* start custom block testing.

* simplify modular workflow ci.

* up

* style.

* up

* up

* up

* up

* up

* up

* Apply suggestions from code review

* up
2026-01-09 15:57:23 +05:30
YiYi Xu
1903383e94 [Modular] qwen refactor (#12872)
* 3 files

* add conditoinal pipeline

* refactor qwen modular

* add layered

* up up

* u p

* add to import

* more refacotr, make layer work

* clean up a bit git add src

* more

* style

* style
2026-01-08 23:38:49 -10:00
Leo Jiang
08f8b7af9a Bugfix for dreambooth flux2 img2img2 (#12825)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2026-01-09 12:34:44 +05:30
Howard Zhang
2f66edc880 Torchao floatx version guard (#12923)
* Adding torchao version guard for floatx usage

Summary: TorchAO removing floatx support, added version guard in quantization_config.py

* Adding torchao version guard for floatx usage

Summary: TorchAO removing floatx support, added version guard in quantization_config.py
Altered tests in test_torchao.py to version guard floatx
Created new test to verify version guard of floatx support

* Adding torchao version guard for floatx usage

Summary: TorchAO removing floatx support, added version guard in quantization_config.py
Altered tests in test_torchao.py to version guard floatx
Created new test to verify version guard of floatx support

* Adding torchao version guard for floatx usage

Summary: TorchAO removing floatx support, added version guard in quantization_config.py
Altered tests in test_torchao.py to version guard floatx
Created new test to verify version guard of floatx support

* Adding torchao version guard for floatx usage

Summary: TorchAO removing floatx support, added version guard in quantization_config.py
Altered tests in test_torchao.py to version guard floatx
Created new test to verify version guard of floatx support

* Adding torchao version guard for floatx usage

Summary: TorchAO removing floatx support, added version guard in quantization_config.py
Altered tests in test_torchao.py to version guard floatx
Created new test to verify version guard of floatx support

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-01-09 10:51:53 +05:30
TmacAaron
be38f41f9f [NPU] npu attention enable ulysses (#12919)
* npu attention enable ulysses

* clean the format

* register _native_npu_attention to _supports_context_parallel

Signed-off-by: yyt <yangyit139@gmail.com>

* change npu_fusion_attention's input_layout to BSND to eliminate redundant transpose

Signed-off-by: yyt <yangyit139@gmail.com>

* Update format

---------

Signed-off-by: yyt <yangyit139@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-01-09 10:11:49 +05:30
MSD
91e5134175 fix the warning torch_dtype is deprecated (#12841)
* fix the warning torch_dtype is deprecated

* Add transformers version check (>= 4.56.0) for dtype parameter

* Fix linting errors
2026-01-09 08:35:26 +05:30
Salman Chishti
a812c87465 Upgrade GitHub Actions for Node 24 compatibility (#12865)
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
2026-01-09 08:28:58 +05:30
Aditya Borate
8b9f817ef5 Fix: Remove hardcoded CUDA autocast in Kandinsky 5 to fix import warning (#12814)
* Fix: Remove hardcoded CUDA autocast in Kandinsky 5 to fix import warning

* Apply style fixes

* Fix: Remove import-time autocast in Kandinsky to prevent warnings

- Removed @torch.autocast decorator from Kandinsky classes.
- Implemented manual F.linear casting to ensure numerical parity with FP32.
- Verified bit-exact output matches main branch.

Co-authored-by: hlky <hlky@hlky.ac>

* Used _keep_in_fp32_modules to align with standards

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
2026-01-08 09:00:58 -10:00
David El Malih
b1f06b780a Improve docstrings and type hints in scheduling_consistency_decoder.py (#12928)
docs: improve docstring scheduling_consistency_decoder.py
2026-01-08 09:45:38 -08:00
Pauline Bailly-Masson
8600b4c10d Add environment variables to checkout step (#12927) 2026-01-08 13:38:06 +05:30
dg845
c10bdd9b73 Add LTX 2.0 Video Pipelines (#12915)
* Initial LTX 2.0 transformer implementation

* Add tests for LTX 2 transformer model

* Get LTX 2 transformer tests working

* Rename LTX 2 compile test class to have LTX2

* Remove RoPE debug print statements

* Get LTX 2 transformer compile tests passing

* Fix LTX 2 transformer shape errors

* Initial script to convert LTX 2 transformer to diffusers

* Add more LTX 2 transformer audio arguments

* Allow LTX 2 transformer to be loaded from local path for conversion

* Improve dummy inputs and add test for LTX 2 transformer consistency

* Fix LTX 2 transformer bugs so consistency test passes

* Initial implementation of LTX 2.0 video VAE

* Explicitly specify temporal and spatial VAE scale factors when converting

* Add initial LTX 2.0 video VAE tests

* Add initial LTX 2.0 video VAE tests (part 2)

* Get diffusers implementation on par with official LTX 2.0 video VAE implementation

* Initial LTX 2.0 vocoder implementation

* Use RMSNorm implementation closer to original for LTX 2.0 video VAE

* start audio decoder.

* init registration.

* up

* simplify and clean up

* up

* Initial LTX 2.0 text encoder implementation

* Rough initial LTX 2.0 pipeline implementation

* up

* up

* up

* up

* Add imports for LTX 2.0 Audio VAE

* Conversion script for LTX 2.0 Audio VAE Decoder

* Add Audio VAE logic to T2V pipeline

* Duplicate scheduler for audio latents

* Support num_videos_per_prompt for prompt embeddings

* LTX 2.0 scheduler and full pipeline conversion

* Add script to test full LTX2Pipeline T2V inference

* Fix pipeline return bugs

* Add LTX 2 text encoder and vocoder to ltx2 subdirectory __init__

* Fix more bugs in LTX2Pipeline.__call__

* Improve CPU offload support

* Fix pipeline audio VAE decoding dtype bug

* Fix video shape error in full pipeline test script

* Get LTX 2 T2V pipeline to produce reasonable outputs

* Make LTX 2.0 scheduler more consistent with original code

* Fix typo when applying scheduler fix in T2V inference script

* Refactor Audio VAE to be simpler and remove helpers (#7)

* remove resolve causality axes stuff.

* remove a bunch of helpers.

* remove adjust output shape helper.

* remove the use of audiolatentshape.

* move normalization and patchify out of pipeline.

* fix

* up

* up

* Remove unpatchify and patchify ops before audio latents denormalization (#9)

---------

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* Add support for I2V (#8)

* start i2v.

* up

* up

* up

* up

* up

* remove uniform strategy code.

* remove unneeded code.

* Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11)

* test i2v.

* Move Video and Audio Text Encoder Connectors to Transformer (#12)

* Denormalize audio latents in I2V pipeline (analogous to T2V change)

* Initial refactor to put video and audio text encoder connectors in transformer

* Get LTX 2 transformer tests working after connector refactor

* precompute run_connectors,.

* fixes

* Address review comments

* Calculate RoPE double precisions freqs using torch instead of np

* Further simplify LTX 2 RoPE freq calc

* Make connectors a separate module (#18)

* remove text_encoder.py

* address yiyi's comments.

* up

* up

* up

* up

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>

* up (#19)

* address initial feedback from lightricks team (#16)

* cross_attn_timestep_scale_multiplier to 1000

* implement split rope type.

* up

* propagate rope_type to rope embed classes as well.

* up

* When using split RoPE, make sure that the output dtype is same as input dtype

* Fix apply split RoPE shape error when reshaping x to 4D

* Add export_utils file for exporting LTX 2.0 videos with audio

* Tests for T2V and I2V (#6)

* add ltx2 pipeline tests.

* up

* up

* up

* up

* remove content

* style

* Denormalize audio latents in I2V pipeline (analogous to T2V change)

* Initial refactor to put video and audio text encoder connectors in transformer

* Get LTX 2 transformer tests working after connector refactor

* up

* up

* i2v tests.

* up

* Address review comments

* Calculate RoPE double precisions freqs using torch instead of np

* Further simplify LTX 2 RoPE freq calc

* revert unneded changes.

* up

* up

* update to split style rope.

* up

---------

Co-authored-by: Daniel Gu <dgu8957@gmail.com>

* up

* use export util funcs.

* Point original checkpoint to LTX 2.0 official checkpoint

* Allow the I2V pipeline to accept image URLs

* make style and make quality

* remove function map.

* remove args.

* update docs.

* update doc entries.

* disable ltx2_consistency test

* Simplify LTX 2 RoPE forward by removing coords is None logic

* make style and make quality

* Support LTX 2.0 audio VAE encoder

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Remove print statement in audio VAE

* up

* Fix bug when calculating audio RoPE coords

* Ltx 2 latent upsample pipeline (#12922)

* Initial implementation of LTX 2.0 latent upsampling pipeline

* Add new LTX 2.0 spatial latent upsampler logic

* Add test script for LTX 2.0 latent upsampling

* Add option to enable VAE tiling in upsampling test script

* Get latent upsampler working with video latents

* Fix typo in BlurDownsample

* Add latent upsample pipeline docstring and example

* Remove deprecated pipeline VAE slicing/tiling methods

* make style and make quality

* When returning latents, return unpacked and denormalized latents for T2V and I2V

* Add model_cpu_offload_seq for latent upsampling pipeline

---------

Co-authored-by: Daniel Gu <dgu8957@gmail.com>

* Fix latent upsampler filename in LTX 2 conversion script

* Add latent upsample pipeline to LTX 2 docs

* Add dummy objects for LTX 2 latent upsample pipeline

* Set default FPS to official LTX 2 ckpt default of 24.0

* Set default CFG scale to official LTX 2 ckpt default of 4.0

* Update LTX 2 pipeline example docstrings

* make style and make quality

* Remove LTX 2 test scripts

* Fix LTX 2 upsample pipeline example docstring

* Add logic to convert and save a LTX 2 upsampling pipeline

* Document LTX2VideoTransformer3DModel forward pass

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2026-01-07 21:24:27 -08:00
Álvaro Somoza
dab000e88b [Modular] Video for Mellon (#12924)
num_frames and videos
2026-01-07 12:35:59 -10:00
David El Malih
9fb6b89d49 Improve docstrings and type hints in scheduling_edm_euler.py (#12871)
* docs: add comprehensive docstrings and refine type hints for EDM scheduler methods and config parameters.

* refactor: Add type hints to DPM-Solver scheduler methods.
2026-01-07 11:18:00 -08:00
Sayak Paul
6fb4c99f5a Update wan.md to remove unneeded hfoptions (#12890) 2026-01-07 09:47:19 -08:00
Sayak Paul
961b9b27d3 [docs] fix torchao typo. (#12883)
fix torchao typo.
2026-01-07 09:43:02 -08:00
Tolga Cangöz
8f30bfff1f Add transformer cache context for SkyReels-V2 pipelines & Update docs (#12837)
* feat: Add transformer cache context for conditional and unconditional predictions for skyreels-v2 pipes.

* docs: Remove SkyReels-V2 FLF2V model link and add contributor attribution.
2026-01-06 22:30:30 -10:00
Leo Jiang
b4be29bda2 Add FSDP option for Flux2 (#12860)
* Add FSDP option for Flux2

* Apply style fixes

* Add FSDP option for Flux2

* Add FSDP option for Flux2

* Add FSDP option for Flux2

* Add FSDP option for Flux2

* Add FSDP option for Flux2

* Update examples/dreambooth/README_flux2.md

* guard accelerate import.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-01-07 13:17:46 +05:30
Hu Yaoqi
98479a94c2 LTX Video 0.9.8 long multi prompt (#12614)
* LTX Video 0.9.8  long multi prompt

* Further align comfyui

- Added the “LTXEulerAncestralRFScheduler” scheduler, aligned with [sample_euler_ancestral_RF](7d6103325e/comfy/k_diffusion/sampling.py (L234))

- Updated the LTXI2VLongMultiPromptPipeline.from_pretrained() method:
  - Now uses LTXEulerAncestralRFScheduler by default, for better compatibility with the ComfyUI LTXV workflow.

- Changed the default value of cond_strength from 1.0 to 0.5, aligning with ComfyUI’s default.

- Optimized cross-window overlap blending: moved the latent-space guidance injection to before the UNet and after each step, aligned with[KSamplerX0Inpaint]([ComfyUI/comfy/samplers.py at master · comfyanonymous/ComfyUI](https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/samplers.py#L391))

- Adjusted the default value of skip_steps_sigma_threshold to 1.

* align with diffusers contribute rule

* Add new pipelines and update imports

* Enhance LTXI2VLongMultiPromptPipeline with noise rescaling

Refactor LTXI2VLongMultiPromptPipeline to improve documentation and add noise rescaling functionality.

* Clean up comments in scheduling_ltx_euler_ancestral_rf.py

Removed design notes and limitations from the implementation.

* Enhance video generation example with scheduler

Updated LTXI2VLongMultiPromptPipeline example to include LTXEulerAncestralRFScheduler for ComfyUI parity.

* clean up

* style

* copies

* import ltx scheduler

* copies

* fix

* fix more

* up up

* up up up

* up upup

* Apply suggestions from code review

* Update docs/source/en/api/pipelines/ltx_video.md

* Update docs/source/en/api/pipelines/ltx_video.md

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2026-01-06 18:18:04 -10:00
zhangtao0408
ade1059ae2 [Flux.1] improve pos embed for ascend npu by computing on npu (#12897)
* [Flux.1] improve pos embed for ascend npu by setting it back to npu computation.

* [Flux.2] improve pos embed for ascend npu by setting it back to npu computation.

* [LongCat-Image] improve pos embed for ascend npu by setting it back to npu computation.

* [Ovis-Image] improve pos embed for ascend npu by setting it back to npu computation.

* Remove unused import of is_torch_npu_available

---------

Co-authored-by: zhangtao <zhangtao529@huawei.com>
2026-01-06 08:48:04 -10:00
dxqb
41a6e86faf Check for attention mask in backends that don't support it (#12892)
* check attention mask

* Apply style fixes

* bugfix

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-01-06 22:52:12 +05:30
Pauline Bailly-Masson
9b5a244653 CodeQL workflow for security analysis 2026-01-06 17:26:08 +01:00