mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
Commit Graph

6187 Commits

Author SHA1 Message Date
sayakpaul
91ee2dd26a resolve conflicts 2026-01-07 10:12:20 +05:30
Hu Yaoqi
98479a94c2 LTX Video 0.9.8 long multi prompt (#12614)
* LTX Video 0.9.8 long multi prompt

* Further align comfyui

- Added the “LTXEulerAncestralRFScheduler” scheduler, aligned with [sample_euler_ancestral_RF](7d6103325e/comfy/k_diffusion/sampling.py (L234))

- Updated the LTXI2VLongMultiPromptPipeline.from_pretrained() method:
  - Now uses LTXEulerAncestralRFScheduler by default, for better compatibility with the ComfyUI LTXV workflow.

- Changed the default value of cond_strength from 1.0 to 0.5, aligning with ComfyUI’s default.

- Optimized cross-window overlap blending: moved the latent-space guidance injection to before the UNet and after each step, aligned with [KSamplerX0Inpaint](https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/samplers.py#L391)

- Adjusted the default value of skip_steps_sigma_threshold to 1.

* align with diffusers contribute rule

* Add new pipelines and update imports

* Enhance LTXI2VLongMultiPromptPipeline with noise rescaling

Refactor LTXI2VLongMultiPromptPipeline to improve documentation and add noise rescaling functionality.

* Clean up comments in scheduling_ltx_euler_ancestral_rf.py

Removed design notes and limitations from the implementation.

* Enhance video generation example with scheduler

Updated LTXI2VLongMultiPromptPipeline example to include LTXEulerAncestralRFScheduler for ComfyUI parity.

* clean up

* style

* copies

* import ltx scheduler

* copies

* fix

* fix more

* up up

* up up up

* up upup

* Apply suggestions from code review

* Update docs/source/en/api/pipelines/ltx_video.md

* Update docs/source/en/api/pipelines/ltx_video.md

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2026-01-06 18:18:04 -10:00
Sayak Paul
cc28cf76a7 Merge branch 'main' into ltx-2-transformer 2026-01-07 09:43:08 +05:30
Daniel Gu
d01a242cdb make style and make quality 2026-01-06 23:54:23 +01:00
Daniel Gu
5e0cf2b2f0 Simplify LTX 2 RoPE forward by removing coords is None logic 2026-01-06 23:32:59 +01:00
zhangtao0408
ade1059ae2 [Flux.1] improve pos embed for ascend npu by computing on npu (#12897)
* [Flux.1] improve pos embed for ascend npu by setting it back to npu computation.

* [Flux.2] improve pos embed for ascend npu by setting it back to npu computation.

* [LongCat-Image] improve pos embed for ascend npu by setting it back to npu computation.

* [Ovis-Image] improve pos embed for ascend npu by setting it back to npu computation.

* Remove unused import of is_torch_npu_available

---------

Co-authored-by: zhangtao <zhangtao529@huawei.com>
2026-01-06 08:48:04 -10:00
dxqb
41a6e86faf Check for attention mask in backends that don't support it (#12892)
* check attention mask

* Apply style fixes

* bugfix

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-01-06 22:52:12 +05:30
Pauline Bailly-Masson
9b5a244653 CodeQL workflow for security analysis 2026-01-06 17:26:08 +01:00
Pauline Bailly-Masson
417f6b2d33 Delete .github/workflows/codeql.yml 2026-01-06 17:25:38 +01:00
Pauline Bailly-Masson
e46354d2d0 Add codeQL workflow (#12917)
Updated CodeQL workflow to use reusable workflow from Hugging Face and simplified language matrix.
2026-01-06 17:19:48 +01:00
Sayak Paul
64b48c1729 Merge branch 'main' into ltx-2-transformer 2026-01-06 21:31:46 +05:30
sayakpaul
8c5ab1fd6d disable ltx2_consistency test 2026-01-06 21:31:29 +05:30
sayakpaul
61e0fb4bd8 update doc entries. 2026-01-06 21:15:47 +05:30
sayakpaul
bdcf23ec17 update docs. 2026-01-06 21:02:18 +05:30
sayakpaul
c39f1b87a4 remove args. 2026-01-06 20:52:49 +05:30
sayakpaul
57ead0b5e5 remove function map. 2026-01-06 20:48:16 +05:30
Pauline Bailly-Masson
db37140474 Refactor environment variable assignments in workflow (#12916) 2026-01-06 13:39:18 +01:00
Sayak Paul
2fc578941b Merge branch 'main' into ltx-2-transformer 2026-01-06 13:51:36 +05:30
hlky
88ffb00139 Detect 2.0 vs 2.1 ZImageControlNetModel (#12861)
* Detect 2.0 vs 2.1 ZImageControlNetModel

* Possibility of control_noise_refiner being removed
2026-01-05 20:28:52 -10:00
Sayak Paul
b6098ca006 [core] remove unneeded autoencoder methods when subclassing from AutoencoderMixin (#12873)
up
2026-01-05 19:43:54 -10:00
Sayak Paul
7c6d314549 fix the use of device_map in CP docs (#12902)
up
2026-01-05 19:42:32 -10:00
Daniel Gu
dd81242eba make style and make quality 2026-01-06 06:42:24 +01:00
Daniel Gu
ace2ee93fb Allow the I2V pipeline to accept image URLs 2026-01-06 06:40:42 +01:00
Daniel Gu
ef199118e2 Point original checkpoint to LTX 2.0 official checkpoint 2026-01-06 06:35:51 +01:00
sayakpaul
550eca3530 use export util funcs. 2026-01-06 09:14:38 +05:30
sayakpaul
c039c87b99 up 2026-01-06 08:09:59 +05:30
sayakpaul
9b8788cc98 resolve conflicts. 2026-01-06 08:09:37 +05:30
Sayak Paul
93a417f24a Tests for T2V and I2V (#6)
* add ltx2 pipeline tests.

* up

* up

* up

* up

* remove content

* style

* Denormalize audio latents in I2V pipeline (analogous to T2V change)

* Initial refactor to put video and audio text encoder connectors in transformer

* Get LTX 2 transformer tests working after connector refactor

* up

* up

* i2v tests.

* up

* Address review comments

* Calculate RoPE double-precision freqs using torch instead of np

* Further simplify LTX 2 RoPE freq calc

* revert unneeded changes.

* up

* up

* update to split style rope.

* up

---------

Co-authored-by: Daniel Gu <dgu8957@gmail.com>
2026-01-06 08:05:30 +05:30
dg845
ce9da5d472 Merge pull request #20 from huggingface/video-export-utils-file
Add export_utils file for exporting LTX 2.0 videos with audio
2026-01-05 18:25:29 -08:00
DefTruth
3138e37fe6 Fix wan 2.1 i2v context parallel (#12909)
* fix wan 2.1 i2v context parallel

* fix wan 2.1 i2v context parallel

* fix wan 2.1 i2v context parallel

* format
2026-01-06 07:42:53 +05:30
Miguel Martin
0da1aa90b5 Fix typo in src/diffusers/pipelines/cosmos/pipeline_cosmos2_5_predict.py (#12914) 2026-01-05 15:44:39 -10:00
Daniel Gu
cb50cacba5 Add export_utils file for exporting LTX 2.0 videos with audio 2026-01-06 02:17:39 +01:00
Daniel Gu
bff989110c Fix apply split RoPE shape error when reshaping x to 4D 2026-01-06 01:22:05 +01:00
Daniel Gu
2fa4f8471f When using split RoPE, make sure that the output dtype is same as input dtype 2026-01-06 00:19:39 +01:00
Sayak Paul
c5b52d6c9f address initial feedback from lightricks team (#16)
* cross_attn_timestep_scale_multiplier to 1000

* implement split rope type.

* up

* propagate rope_type to rope embed classes as well.

* up
2026-01-05 21:13:10 +05:30
Sayak Paul
0be4f31620 up (#19) 2026-01-05 21:13:01 +05:30
dg845
caae16768a Move Video and Audio Text Encoder Connectors to Transformer (#12)
* Denormalize audio latents in I2V pipeline (analogous to T2V change)

* Initial refactor to put video and audio text encoder connectors in transformer

* Get LTX 2 transformer tests working after connector refactor

* precompute run_connectors.

* fixes

* Address review comments

* Calculate RoPE double-precision freqs using torch instead of np

* Further simplify LTX 2 RoPE freq calc

* Make connectors a separate module (#18)

* remove text_encoder.py

* address yiyi's comments.

* up

* up

* up

* up

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2026-01-05 20:11:13 +05:30
Jefri Haryono
5ffb65803d Community Pipeline: Add z-image differential img2img (#12882)
* Community Pipeline: Add z-image differential img2img

* add pipeline for z-image differential img2img diffusion examples: run make style, make quality, and fix white spaces in example doc string.

---------

Co-authored-by: r4inm4ker <jefri.yeh@gmail.com>
2026-01-05 09:53:52 -03:00
DefTruth
d0ae34d313 chore: fix dev version in setup.py (#12904) 2026-01-05 09:21:48 +05:30
hlky
47378066c0 Z-Image-Turbo from_single_file fix (#12888) 2026-01-02 22:29:24 +05:30
Maxim Balabanski
208cda8f6d fix Qwen Image Transformer single file loading mapping function to be consistent with other loader APIs (#12894)
fix Qwen single file loading to be consistent with other loader API
2026-01-02 12:59:11 +05:30
dg845
aae70b90db Merge pull request #10 from huggingface/make-scheduler-consistent
Make LTX 2.0 Scheduler `sigmas` Consistent with Original Code
2025-12-31 13:46:47 -08:00
sayakpaul
d3f10fe54e test i2v. 2025-12-31 09:36:48 +05:30
dg845
bd607b97a8 Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11) 2025-12-31 09:23:35 +05:30
Daniel Gu
6a236a27fb Merge branch 'ltx-2-transformer' into make-scheduler-consistent 2025-12-30 20:25:59 +01:00
Vasiliy Kuznetsov
1cdb8723b8 fix torchao quantizer for new torchao versions (#12901)
* fix torchao quantizer for new torchao versions

Summary:

`torchao==0.16.0` (not yet released) has some bc-breaking changes, this
PR fixes the diffusers repo with those changes. Specifics on the
changes:
1. `UInt4Tensor` is removed: https://github.com/pytorch/ao/pull/3536
2. old float8 tensors v1 are removed: https://github.com/pytorch/ao/pull/3510

In this PR:
1. move the logger variable up (not sure why it was in the middle of the
   file before) to get better error messages
2. gate the old torchao objects by torchao version

Test Plan:

import diffusers objects with new versions of torchao works:

```bash
> python -c "import torchao; print(torchao.__version__); from diffusers import StableDiffusionPipeline"
0.16.0.dev20251229+cu129
```
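
The version gating described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `parse_release` and `supports_legacy_objects` are hypothetical, and the `0.16.0` cutoff is taken from the summary above. It uses only the standard library and compares the numeric release segment, so a nightly such as `0.16.0.dev20251229+cu129` is treated as already having the removals.

```python
import re


def parse_release(version: str) -> tuple:
    """Return the leading numeric release segment as a 3-tuple.

    Example: "0.16.0.dev20251229+cu129" -> (0, 16, 0).
    Dev/local suffixes are deliberately ignored (an assumption of this sketch).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad short versions such as "0.16" so tuple comparison behaves as expected.
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def supports_legacy_objects(installed: str, removed_in: str = "0.16.0") -> bool:
    """Hypothetical gate: True if the installed version predates the removal."""
    return parse_release(installed) < parse_release(removed_in)


if __name__ == "__main__":
    for v in ("0.15.0", "0.16.0.dev20251229+cu129"):
        print(v, "->", supports_legacy_objects(v))
```

A caller would wrap imports of the removed objects in `if supports_legacy_objects(installed_version):` so that newer installs skip them instead of failing at import time.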

Reviewers:

Subscribers:

Tasks:

Tags:

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-12-30 10:04:54 +05:30
Sayak Paul
46822c43db Add support for I2V (#8)
* start i2v.

* up

* up

* up

* up

* up

* remove uniform strategy code.

* remove unneeded code.
2025-12-30 09:06:07 +05:30
Sayak Paul
280e347814 Refactor Audio VAE to be simpler and remove helpers (#7)
* remove resolve causality axes stuff.

* remove a bunch of helpers.

* remove adjust output shape helper.

* remove the use of audiolatentshape.

* move normalization and patchify out of pipeline.

* fix

* up

* up

* Remove unpatchify and patchify ops before audio latents denormalization (#9)

---------

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
2025-12-30 08:05:56 +05:30
Daniel Gu
e1f0b7e255 Fix typo when applying scheduler fix in T2V inference script 2025-12-30 00:38:51 +01:00
Daniel Gu
581f21c431 Make LTX 2.0 scheduler more consistent with original code 2025-12-29 23:44:52 +01:00