* Initial LTX 2.0 transformer implementation
* Add tests for LTX 2 transformer model
* Get LTX 2 transformer tests working
* Rename LTX 2 compile test class to have LTX2
* Remove RoPE debug print statements
* Get LTX 2 transformer compile tests passing
* Fix LTX 2 transformer shape errors
* Initial script to convert LTX 2 transformer to diffusers
* Add more LTX 2 transformer audio arguments
* Allow LTX 2 transformer to be loaded from local path for conversion
* Improve dummy inputs and add test for LTX 2 transformer consistency
* Fix LTX 2 transformer bugs so consistency test passes
* Initial implementation of LTX 2.0 video VAE
* Explicitly specify temporal and spatial VAE scale factors when converting
* Add initial LTX 2.0 video VAE tests
* Add initial LTX 2.0 video VAE tests (part 2)
* Get diffusers implementation on par with official LTX 2.0 video VAE implementation
* Initial LTX 2.0 vocoder implementation
* Use RMSNorm implementation closer to original for LTX 2.0 video VAE
* start audio decoder.
* init registration.
* up
* simplify and clean up
* up
* Initial LTX 2.0 text encoder implementation
* Rough initial LTX 2.0 pipeline implementation
* up
* up
* up
* up
* Add imports for LTX 2.0 Audio VAE
* Conversion script for LTX 2.0 Audio VAE Decoder
* Add Audio VAE logic to T2V pipeline
* Duplicate scheduler for audio latents
* Support num_videos_per_prompt for prompt embeddings
* LTX 2.0 scheduler and full pipeline conversion
* Add script to test full LTX2Pipeline T2V inference
* Fix pipeline return bugs
* Add LTX 2 text encoder and vocoder to ltx2 subdirectory __init__
* Fix more bugs in LTX2Pipeline.__call__
* Improve CPU offload support
* Fix pipeline audio VAE decoding dtype bug
* Fix video shape error in full pipeline test script
* Get LTX 2 T2V pipeline to produce reasonable outputs
* Make LTX 2.0 scheduler more consistent with original code
* Fix typo when applying scheduler fix in T2V inference script
* Refactor Audio VAE to be simpler and remove helpers (#7)
* remove resolve causality axes stuff.
* remove a bunch of helpers.
* remove adjust output shape helper.
* remove the use of audiolatentshape.
* move normalization and patchify out of pipeline.
* fix
* up
* up
* Remove unpatchify and patchify ops before audio latents denormalization (#9)
---------
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Add support for I2V (#8)
* start i2v.
* up
* up
* up
* up
* up
* remove uniform strategy code.
* remove unneeded code.
* Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11)
* test i2v.
* Move Video and Audio Text Encoder Connectors to Transformer (#12)
* Denormalize audio latents in I2V pipeline (analogous to T2V change)
* Initial refactor to put video and audio text encoder connectors in transformer
* Get LTX 2 transformer tests working after connector refactor
* precompute run_connectors,.
* fixes
* Address review comments
* Calculate RoPE double precisions freqs using torch instead of np
* Further simplify LTX 2 RoPE freq calc
* Make connectors a separate module (#18)
* remove text_encoder.py
* address yiyi's comments.
* up
* up
* up
* up
---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
* up (#19)
* address initial feedback from lightricks team (#16)
* cross_attn_timestep_scale_multiplier to 1000
* implement split rope type.
* up
* propagate rope_type to rope embed classes as well.
* up
* When using split RoPE, make sure that the output dtype is same as input dtype
* Fix apply split RoPE shape error when reshaping x to 4D
* Add export_utils file for exporting LTX 2.0 videos with audio
* Tests for T2V and I2V (#6)
* add ltx2 pipeline tests.
* up
* up
* up
* up
* remove content
* style
* Denormalize audio latents in I2V pipeline (analogous to T2V change)
* Initial refactor to put video and audio text encoder connectors in transformer
* Get LTX 2 transformer tests working after connector refactor
* up
* up
* i2v tests.
* up
* Address review comments
* Calculate RoPE double precisions freqs using torch instead of np
* Further simplify LTX 2 RoPE freq calc
* revert unneded changes.
* up
* up
* update to split style rope.
* up
---------
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
* up
* use export util funcs.
* Point original checkpoint to LTX 2.0 official checkpoint
* Allow the I2V pipeline to accept image URLs
* make style and make quality
* remove function map.
* remove args.
* update docs.
* update doc entries.
* disable ltx2_consistency test
* Simplify LTX 2 RoPE forward by removing coords is None logic
* make style and make quality
* Support LTX 2.0 audio VAE encoder
* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Remove print statement in audio VAE
* up
* Fix bug when calculating audio RoPE coords
* Ltx 2 latent upsample pipeline (#12922)
* Initial implementation of LTX 2.0 latent upsampling pipeline
* Add new LTX 2.0 spatial latent upsampler logic
* Add test script for LTX 2.0 latent upsampling
* Add option to enable VAE tiling in upsampling test script
* Get latent upsampler working with video latents
* Fix typo in BlurDownsample
* Add latent upsample pipeline docstring and example
* Remove deprecated pipeline VAE slicing/tiling methods
* make style and make quality
* When returning latents, return unpacked and denormalized latents for T2V and I2V
* Add model_cpu_offload_seq for latent upsampling pipeline
---------
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
* Fix latent upsampler filename in LTX 2 conversion script
* Add latent upsample pipeline to LTX 2 docs
* Add dummy objects for LTX 2 latent upsample pipeline
* Set default FPS to official LTX 2 ckpt default of 24.0
* Set default CFG scale to official LTX 2 ckpt default of 4.0
* Update LTX 2 pipeline example docstrings
* make style and make quality
* Remove LTX 2 test scripts
* Fix LTX 2 upsample pipeline example docstring
* Add logic to convert and save a LTX 2 upsampling pipeline
* Document LTX2VideoTransformer3DModel forward pass
---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>