mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
Commit Graph

6128 Commits

Author SHA1 Message Date
Daniel Gu
2fa4f8471f When using split RoPE, make sure that the output dtype is same as input dtype 2026-01-06 00:19:39 +01:00
Sayak Paul
c5b52d6c9f address initial feedback from lightricks team (#16)
* cross_attn_timestep_scale_multiplier to 1000

* implement split rope type.

* up

* propagate rope_type to rope embed classes as well.

* up
2026-01-05 21:13:10 +05:30
Sayak Paul
0be4f31620 up (#19) 2026-01-05 21:13:01 +05:30
dg845
caae16768a Move Video and Audio Text Encoder Connectors to Transformer (#12)
* Denormalize audio latents in I2V pipeline (analogous to T2V change)

* Initial refactor to put video and audio text encoder connectors in transformer

* Get LTX 2 transformer tests working after connector refactor

* precompute run_connectors,.

* fixes

* Address review comments

* Calculate RoPE double precisions freqs using torch instead of np

* Further simplify LTX 2 RoPE freq calc

* Make connectors a separate module (#18)

* remove text_encoder.py

* address yiyi's comments.

* up

* up

* up

* up

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2026-01-05 20:11:13 +05:30
dg845
aae70b90db Merge pull request #10 from huggingface/make-scheduler-consistent
Make LTX 2.0 Scheduler `sigmas` Consistent with Original Code
2025-12-31 13:46:47 -08:00
sayakpaul
d3f10fe54e test i2v. 2025-12-31 09:36:48 +05:30
dg845
bd607b97a8 Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11) 2025-12-31 09:23:35 +05:30
Daniel Gu
6a236a27fb Merge branch 'ltx-2-transformer' into make-scheduler-consistent 2025-12-30 20:25:59 +01:00
Sayak Paul
46822c43db Add support for I2V (#8)
* start i2v.

* up

* up

* up

* up

* up

* remove uniform strategy code.

* remove unneeded code.
2025-12-30 09:06:07 +05:30
Sayak Paul
280e347814 Refactor Audio VAE to be simpler and remove helpers (#7)
* remove resolve causality axes stuff.

* remove a bunch of helpers.

* remove adjust output shape helper.

* remove the use of audiolatentshape.

* move normalization and patchify out of pipeline.

* fix

* up

* up

* Remove unpatchify and patchify ops before audio latents denormalization (#9)

---------

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
2025-12-30 08:05:56 +05:30
Daniel Gu
e1f0b7e255 Fix typo when applying scheduler fix in T2V inference script 2025-12-30 00:38:51 +01:00
Daniel Gu
581f21c431 Make LTX 2.0 scheduler more consistent with original code 2025-12-29 23:44:52 +01:00
dg845
0c41297453 Merge pull request #4 from huggingface/ltx-2-t2v-pipeline
LTX 2.0 Text-to-Video (T2V) Pipeline
2025-12-23 21:29:25 -08:00
Daniel Gu
b5891b19b1 Get LTX 2 T2V pipeline to produce reasonable outputs 2025-12-24 06:07:38 +01:00
Daniel Gu
e89d9c1951 Fix video shape error in full pipeline test script 2025-12-23 11:14:05 +01:00
Daniel Gu
f9b947651f Fix pipeline audio VAE decoding dtype bug 2025-12-23 11:03:19 +01:00
Daniel Gu
1484c43183 Improve CPU offload support 2025-12-23 10:56:32 +01:00
Daniel Gu
90edc6abc9 Fix more bugs in LTX2Pipeline.__call__ 2025-12-23 10:41:27 +01:00
Daniel Gu
a56cf23483 Add LTX 2 text encoder and vocoder to ltx2 subdirectory __init__ 2025-12-23 10:40:56 +01:00
Daniel Gu
fa7d9f77f1 Fix pipeline return bugs 2025-12-23 08:49:11 +01:00
Daniel Gu
3bf736979f Add script to test full LTX2Pipeline T2V inference 2025-12-23 08:43:37 +01:00
Daniel Gu
595f485ad8 LTX 2.0 scheduler and full pipeline conversion 2025-12-23 07:41:28 +01:00
Daniel Gu
cbb10b8dca Support num_videos_per_prompt for prompt embeddings 2025-12-23 07:01:17 +01:00
Daniel Gu
6e6ce20595 Duplicate scheduler for audio latents 2025-12-23 06:40:35 +01:00
Daniel Gu
54bfc5d617 Add Audio VAE logic to T2V pipeline 2025-12-23 03:51:22 +01:00
Daniel Gu
ae3b6e7cc2 Merge branch 'ltx-2-transformer' into ltx-2-t2v-pipeline 2025-12-23 02:59:33 +01:00
Daniel Gu
d303e2a6ff Conversion script for LTX 2.0 Audio VAE Decoder 2025-12-23 02:48:15 +01:00
Daniel Gu
5f7e43d17f Add imports for LTX 2.0 Audio VAE 2025-12-23 02:08:51 +01:00
dg845
7bb4cf76ce Merge pull request #5 from huggingface/audio-decoder
Audio decoder
2025-12-22 17:00:11 -08:00
sayakpaul
409d651bab resolve conflicts. 2025-12-22 15:59:31 +05:30
sayakpaul
8134da6a56 up 2025-12-22 15:55:29 +05:30
Sayak Paul
059999a3f7 up 2025-12-22 10:24:55 +00:00
sayakpaul
58257eb0e0 up 2025-12-22 15:45:56 +05:30
Sayak Paul
5f0f2a03f7 up 2025-12-22 10:06:39 +00:00
Daniel Gu
d0f9cdaab1 Rough initial LTX 2.0 pipeline implementation 2025-12-22 10:07:20 +01:00
Daniel Gu
0028955c37 Initial LTX 2.0 text encoder implementation 2025-12-22 10:06:01 +01:00
sayakpaul
4904fd6fa5 up 2025-12-22 13:46:58 +05:30
sayakpaul
907896d533 simplify and clean up 2025-12-22 13:41:41 +05:30
sayakpaul
e54cd6bb1d up 2025-12-22 13:03:40 +05:30
sayakpaul
f4c2435d61 init registration. 2025-12-22 12:25:36 +05:30
sayakpaul
b34ddb1736 start audio decoder. 2025-12-22 12:23:31 +05:30
Daniel Gu
6c56954fa8 Use RMSNorm implementation closer to original for LTX 2.0 video VAE 2025-12-20 02:40:38 +01:00
dg845
b1cf6ff8a9 Merge pull request #2 from huggingface/ltx-2-video-vae
LTX 2.0 Video VAE Implementation
2025-12-19 16:36:38 -08:00
dg845
8bfeb4af56 Merge pull request #3 from huggingface/ltx-2-vocoder
LTX 2.0 Vocoder Implementation
2025-12-19 16:21:31 -08:00
Daniel Gu
c6a11a5530 Initial LTX 2.0 vocoder implementation 2025-12-19 12:17:10 +01:00
Daniel Gu
a748975a7c Get diffusers implementation on par with official LTX 2.0 video VAE implementation 2025-12-19 07:02:38 +01:00
Daniel Gu
491aae08d8 Add initial LTX 2.0 video VAE tests (part 2) 2025-12-17 11:39:09 +01:00
Daniel Gu
5b950d6fef Add initial LTX 2.0 video VAE tests 2025-12-17 11:30:15 +01:00
Daniel Gu
baf23e2da3 Explicitly specify temporal and spatial VAE scale factors when converting 2025-12-17 11:14:45 +01:00
Daniel Gu
269cf7b40d Initial implementation of LTX 2.0 video VAE 2025-12-17 10:51:34 +01:00