diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Author	SHA1	Message	Date
Daniel Gu	084490cd98	Merge branch 'ltx-2-transformer' into ltx-2-latent-upsample-pipeline	2026-01-06 03:29:38 +01:00
dg845	ce9da5d472	Merge pull request #20 from huggingface/video-export-utils-file Add export_utils file for exporting LTX 2.0 videos with audio	2026-01-05 18:25:29 -08:00
Daniel Gu	90516804e0	Merge branch 'ltx-2-transformer' into ltx-2-latent-upsample-pipeline	2026-01-06 03:18:51 +01:00
Daniel Gu	cb50cacba5	Add export_utils file for exporting LTX 2.0 videos with audio	2026-01-06 02:17:39 +01:00
Daniel Gu	bff989110c	Fix apply split RoPE shape error when reshaping x to 4D	2026-01-06 01:22:05 +01:00
Daniel Gu	2fa4f8471f	When using split RoPE, make sure that the output dtype is same as input dtype	2026-01-06 00:19:39 +01:00
Sayak Paul	c5b52d6c9f	address initial feedback from lightricks team (#16) * cross_attn_timestep_scale_multiplier to 1000 * implement split rope type. * up * propagate rope_type to rope embed classes as well. * up	2026-01-05 21:13:10 +05:30
Sayak Paul	0be4f31620	up (#19)	2026-01-05 21:13:01 +05:30
dg845	caae16768a	Move Video and Audio Text Encoder Connectors to Transformer (#12) * Denormalize audio latents in I2V pipeline (analogous to T2V change) * Initial refactor to put video and audio text encoder connectors in transformer * Get LTX 2 transformer tests working after connector refactor * precompute run_connectors,. * fixes * Address review comments * Calculate RoPE double precisions freqs using torch instead of np * Further simplify LTX 2 RoPE freq calc * Make connectors a separate module (#18) * remove text_encoder.py * address yiyi's comments. * up * up * up * up --------- Co-authored-by: sayakpaul <spsayakpaul@gmail.com>	2026-01-05 20:11:13 +05:30
Daniel Gu	fe3ba3b698	Initial implementation of LTX 2.0 latent upsampling pipeline	2026-01-02 20:18:32 +01:00
dg845	aae70b90db	Merge pull request #10 from huggingface/make-scheduler-consistent Make LTX 2.0 Scheduler `sigmas` Consistent with Original Code	2025-12-31 13:46:47 -08:00
sayakpaul	d3f10fe54e	test i2v.	2025-12-31 09:36:48 +05:30
dg845	bd607b97a8	Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11)	2025-12-31 09:23:35 +05:30
Daniel Gu	6a236a27fb	Merge branch 'ltx-2-transformer' into make-scheduler-consistent	2025-12-30 20:25:59 +01:00
Sayak Paul	46822c43db	Add support for I2V (#8) * start i2v. * up * up * up * up * up * remove uniform strategy code. * remove unneeded code.	2025-12-30 09:06:07 +05:30
Sayak Paul	280e347814	Refactor Audio VAE to be simpler and remove helpers (#7) * remove resolve causality axes stuff. * remove a bunch of helpers. * remove adjust output shape helper. * remove the use of audiolatentshape. * move normalization and patchify out of pipeline. * fix * up * up * Remove unpatchify and patchify ops before audio latents denormalization (#9) --------- Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2025-12-30 08:05:56 +05:30
Daniel Gu	e1f0b7e255	Fix typo when applying scheduler fix in T2V inference script	2025-12-30 00:38:51 +01:00
Daniel Gu	581f21c431	Make LTX 2.0 scheduler more consistent with original code	2025-12-29 23:44:52 +01:00
dg845	0c41297453	Merge pull request #4 from huggingface/ltx-2-t2v-pipeline LTX 2.0 Text-to-Video (T2V) Pipeline	2025-12-23 21:29:25 -08:00
Daniel Gu	b5891b19b1	Get LTX 2 T2V pipeline to produce reasonable outputs	2025-12-24 06:07:38 +01:00
Daniel Gu	e89d9c1951	Fix video shape error in full pipeline test script	2025-12-23 11:14:05 +01:00
Daniel Gu	f9b947651f	Fix pipeline audio VAE decoding dtype bug	2025-12-23 11:03:19 +01:00
Daniel Gu	1484c43183	Improve CPU offload support	2025-12-23 10:56:32 +01:00
Daniel Gu	90edc6abc9	Fix more bugs in LTX2Pipeline.__call__	2025-12-23 10:41:27 +01:00
Daniel Gu	a56cf23483	Add LTX 2 text encoder and vocoder to ltx2 subdirectory __init__	2025-12-23 10:40:56 +01:00
Daniel Gu	fa7d9f77f1	Fix pipeline return bugs	2025-12-23 08:49:11 +01:00
Daniel Gu	3bf736979f	Add script to test full LTX2Pipeline T2V inference	2025-12-23 08:43:37 +01:00
Daniel Gu	595f485ad8	LTX 2.0 scheduler and full pipeline conversion	2025-12-23 07:41:28 +01:00
Daniel Gu	cbb10b8dca	Support num_videos_per_prompt for prompt embeddings	2025-12-23 07:01:17 +01:00
Daniel Gu	6e6ce20595	Duplicate scheduler for audio latents	2025-12-23 06:40:35 +01:00
Daniel Gu	54bfc5d617	Add Audio VAE logic to T2V pipeline	2025-12-23 03:51:22 +01:00
Daniel Gu	ae3b6e7cc2	Merge branch 'ltx-2-transformer' into ltx-2-t2v-pipeline	2025-12-23 02:59:33 +01:00
Daniel Gu	d303e2a6ff	Conversion script for LTX 2.0 Audio VAE Decoder	2025-12-23 02:48:15 +01:00
Daniel Gu	5f7e43d17f	Add imports for LTX 2.0 Audio VAE	2025-12-23 02:08:51 +01:00
dg845	7bb4cf76ce	Merge pull request #5 from huggingface/audio-decoder Audio decoder	2025-12-22 17:00:11 -08:00
sayakpaul	409d651bab	resolve conflicts.	2025-12-22 15:59:31 +05:30
sayakpaul	8134da6a56	up	2025-12-22 15:55:29 +05:30
Sayak Paul	059999a3f7	up	2025-12-22 10:24:55 +00:00
sayakpaul	58257eb0e0	up	2025-12-22 15:45:56 +05:30
Sayak Paul	5f0f2a03f7	up	2025-12-22 10:06:39 +00:00
Daniel Gu	d0f9cdaab1	Rough initial LTX 2.0 pipeline implementation	2025-12-22 10:07:20 +01:00
Daniel Gu	0028955c37	Initial LTX 2.0 text encoder implementation	2025-12-22 10:06:01 +01:00
sayakpaul	4904fd6fa5	up	2025-12-22 13:46:58 +05:30
sayakpaul	907896d533	simplify and clean up	2025-12-22 13:41:41 +05:30
sayakpaul	e54cd6bb1d	up	2025-12-22 13:03:40 +05:30
sayakpaul	f4c2435d61	init registration.	2025-12-22 12:25:36 +05:30
sayakpaul	b34ddb1736	start audio decoder.	2025-12-22 12:23:31 +05:30
Daniel Gu	6c56954fa8	Use RMSNorm implementation closer to original for LTX 2.0 video VAE	2025-12-20 02:40:38 +01:00
dg845	b1cf6ff8a9	Merge pull request #2 from huggingface/ltx-2-video-vae LTX 2.0 Video VAE Implementation	2025-12-19 16:36:38 -08:00
dg845	8bfeb4af56	Merge pull request #3 from huggingface/ltx-2-vocoder LTX 2.0 Vocoder Implementation	2025-12-19 16:21:31 -08:00

6134 Commits