* Adding torchao version guard for floatx usage
Summary: TorchAO is removing floatx support, so a version guard was added in quantization_config.py.
Altered tests in test_torchao.py to version-guard floatx.
Created a new test to verify the version guard of floatx support.
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
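For reference, a version guard of the kind described above usually follows this pattern (a minimal sketch; the cutoff version, the "fpx" prefix check, and the function name are illustrative assumptions, not the exact diffusers code):

```python
# Minimal sketch of a torchao version guard; the cutoff version and the
# "fpx" prefix check are illustrative assumptions, not the real code.
import importlib.metadata

from packaging import version


def _validate_floatx_support(quant_type: str) -> None:
    torchao_version = version.parse(importlib.metadata.version("torchao"))
    if quant_type.startswith("fpx") and torchao_version > version.parse("0.9.0"):
        raise ValueError(
            f"Quantization type {quant_type!r} requires torchao<=0.9.0, "
            f"but torchao=={torchao_version} is installed."
        )
```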
* Initial LTX 2.0 transformer implementation
* Add tests for LTX 2 transformer model
* Get LTX 2 transformer tests working
* Rename LTX 2 compile test class to have LTX2
* Remove RoPE debug print statements
* Get LTX 2 transformer compile tests passing
* Fix LTX 2 transformer shape errors
* Initial script to convert LTX 2 transformer to diffusers
* Add more LTX 2 transformer audio arguments
* Allow LTX 2 transformer to be loaded from local path for conversion
* Improve dummy inputs and add test for LTX 2 transformer consistency
* Fix LTX 2 transformer bugs so consistency test passes
* Initial implementation of LTX 2.0 video VAE
* Explicitly specify temporal and spatial VAE scale factors when converting
* Add initial LTX 2.0 video VAE tests
* Add initial LTX 2.0 video VAE tests (part 2)
* Get diffusers implementation on par with official LTX 2.0 video VAE implementation
* Initial LTX 2.0 vocoder implementation
* Use RMSNorm implementation closer to original for LTX 2.0 video VAE
* start audio decoder.
* init registration.
* up
* simplify and clean up
* up
* Initial LTX 2.0 text encoder implementation
* Rough initial LTX 2.0 pipeline implementation
* up
* up
* up
* up
* Add imports for LTX 2.0 Audio VAE
* Conversion script for LTX 2.0 Audio VAE Decoder
* Add Audio VAE logic to T2V pipeline
* Duplicate scheduler for audio latents
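Duplicating the scheduler is presumably the standard diffusers pattern of building a second instance from the same config, so the audio and video denoising loops keep independent step state; a sketch (assuming a loaded pipeline `pipe`):

```python
# Sketch: a separate scheduler instance for audio latents so its internal
# step state does not interfere with the video scheduler's state.
audio_scheduler = pipe.scheduler.__class__.from_config(pipe.scheduler.config)
```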
* Support num_videos_per_prompt for prompt embeddings
* LTX 2.0 scheduler and full pipeline conversion
* Add script to test full LTX2Pipeline T2V inference
* Fix pipeline return bugs
* Add LTX 2 text encoder and vocoder to ltx2 subdirectory __init__
* Fix more bugs in LTX2Pipeline.__call__
* Improve CPU offload support
* Fix pipeline audio VAE decoding dtype bug
* Fix video shape error in full pipeline test script
* Get LTX 2 T2V pipeline to produce reasonable outputs
* Make LTX 2.0 scheduler more consistent with original code
* Fix typo when applying scheduler fix in T2V inference script
* Refactor Audio VAE to be simpler and remove helpers (#7)
* remove resolve causality axes stuff.
* remove a bunch of helpers.
* remove adjust output shape helper.
* remove the use of audiolatentshape.
* move normalization and patchify out of pipeline.
* fix
* up
* up
* Remove unpatchify and patchify ops before audio latents denormalization (#9)
---------
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Add support for I2V (#8)
* start i2v.
* up
* up
* up
* up
* up
* remove uniform strategy code.
* remove unneeded code.
* Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11)
* test i2v.
* Move Video and Audio Text Encoder Connectors to Transformer (#12)
* Denormalize audio latents in I2V pipeline (analogous to T2V change)
* Initial refactor to put video and audio text encoder connectors in transformer
* Get LTX 2 transformer tests working after connector refactor
* precompute run_connectors.
* fixes
* Address review comments
* Calculate RoPE double precision freqs using torch instead of np
* Further simplify LTX 2 RoPE freq calc
* Make connectors a separate module (#18)
* remove text_encoder.py
* address yiyi's comments.
* up
* up
* up
* up
---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
* up (#19)
* address initial feedback from lightricks team (#16)
* Set cross_attn_timestep_scale_multiplier to 1000
* implement split rope type.
* up
* propagate rope_type to rope embed classes as well.
* up
* When using split RoPE, make sure that the output dtype is same as input dtype
* Fix apply split RoPE shape error when reshaping x to 4D
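The dtype fix above presumably amounts to doing the rotation math in float32 and casting back at the end; a hedged sketch of split-style RoPE (illustrative, not the exact LTX 2 code):

```python
import torch


def apply_split_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    # Rotate the two channel halves in float32 for precision, then cast
    # back so the output dtype matches the input dtype.
    orig_dtype = x.dtype
    x1, x2 = x.float().chunk(2, dim=-1)
    out = torch.cat([x1 * cos - x2 * sin, x2 * cos + x1 * sin], dim=-1)
    return out.to(orig_dtype)
```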
* Add export_utils file for exporting LTX 2.0 videos with audio
* Tests for T2V and I2V (#6)
* add ltx2 pipeline tests.
* up
* up
* up
* up
* remove content
* style
* Denormalize audio latents in I2V pipeline (analogous to T2V change)
* Initial refactor to put video and audio text encoder connectors in transformer
* Get LTX 2 transformer tests working after connector refactor
* up
* up
* i2v tests.
* up
* Address review comments
* Calculate RoPE double precision freqs using torch instead of np
* Further simplify LTX 2 RoPE freq calc
* revert unneeded changes.
* up
* up
* update to split style rope.
* up
---------
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
* up
* use export util funcs.
* Point original checkpoint to LTX 2.0 official checkpoint
* Allow the I2V pipeline to accept image URLs
* make style and make quality
* remove function map.
* remove args.
* update docs.
* update doc entries.
* disable ltx2_consistency test
* Simplify LTX 2 RoPE forward by removing coords is None logic
* make style and make quality
* Support LTX 2.0 audio VAE encoder
* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Remove print statement in audio VAE
* up
* Fix bug when calculating audio RoPE coords
* LTX 2 latent upsample pipeline (#12922)
* Initial implementation of LTX 2.0 latent upsampling pipeline
* Add new LTX 2.0 spatial latent upsampler logic
* Add test script for LTX 2.0 latent upsampling
* Add option to enable VAE tiling in upsampling test script
* Get latent upsampler working with video latents
* Fix typo in BlurDownsample
* Add latent upsample pipeline docstring and example
* Remove deprecated pipeline VAE slicing/tiling methods
* make style and make quality
* When returning latents, return unpacked and denormalized latents for T2V and I2V
* Add model_cpu_offload_seq for latent upsampling pipeline
---------
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
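In use, a latent-upsample pipeline like this typically slots between a base latent-space generation and the final decode, analogous to the existing LTX latent-upsample flow; a rough sketch in which `pipe`, `upsample_pipe`, and the `frames` output attribute are all assumptions:

```python
# Sketch (illustrative names): generate in latent space, upsample the
# latents, and only decode to pixels at the end.
base_out = pipe(prompt=prompt, output_type="latent")
upsampled = upsample_pipe(latents=base_out.frames, output_type="latent")
```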
* Fix latent upsampler filename in LTX 2 conversion script
* Add latent upsample pipeline to LTX 2 docs
* Add dummy objects for LTX 2 latent upsample pipeline
* Set default FPS to official LTX 2 ckpt default of 24.0
* Set default CFG scale to official LTX 2 ckpt default of 4.0
* Update LTX 2 pipeline example docstrings
* make style and make quality
* Remove LTX 2 test scripts
* Fix LTX 2 upsample pipeline example docstring
* Add logic to convert and save a LTX 2 upsampling pipeline
* Document LTX2VideoTransformer3DModel forward pass
---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
* cosmos predict2.5 base: convert checkpoint & pipeline
- New scheduler: scheduling_flow_unipc_multistep.py
- Changes to TransformerCosmos for text embeddings via crossattn_proj
* scheduler cleanup
* simplify inference pipeline
* cleanup scheduler + tests
* Basic tests for flow unipc
* working b2b inference
* Rename everything
* Tests for pipeline present, but not working (predict2 also not working)
* docstring update
* wrapper pipelines + make style
* remove unnecessary files
* UniPCMultistep: support use_karras_sigmas=True and use_flow_sigmas=True
* use UniPCMultistepScheduler + fix tests for pipeline
* Remove FlowUniPCMultistepScheduler
* UniPCMultistepScheduler for use_flow_sigmas=True & use_karras_sigmas=True
* num_inference_steps=36 due to bug in scheduler used by predict2.5
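With FlowUniPCMultistepScheduler removed, the flow-matching behavior lives behind a flag on the stock scheduler; usage would look roughly like this (assuming a recent diffusers release and a loaded pipeline `pipe`):

```python
from diffusers import UniPCMultistepScheduler

# Sketch: rebuild the scheduler from the pipeline's config with flow
# sigmas enabled (use_karras_sigmas is toggled the same way).
pipe.scheduler = UniPCMultistepScheduler.from_config(
    pipe.scheduler.config, use_flow_sigmas=True
)
```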
* Address comments
* make style + make fix-copies
* fix tests + remove references to old pipelines
* address comments
* add revision in from_pretrained call
* fix tests
* extend TorchAoTest::test_model_memory_usage to other platforms
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
* add some comments
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
fix pytest tests/pipelines/pixart_sigma/test_pixart.py::PixArtSigmaPipelineIntegrationTests::test_pixart_512 on XPU
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Add ZImageImg2ImgPipeline
Updated the pipeline structure to include ZImageImg2ImgPipeline alongside ZImagePipeline.
Implemented the ZImageImg2ImgPipeline class for image-to-image transformations, including the necessary methods for encoding prompts, preparing latents, and denoising.
Enhanced the auto_pipeline to map the new ZImageImg2ImgPipeline for image generation tasks.
Added unit tests for ZImageImg2ImgPipeline to ensure functionality and performance.
Updated dummy objects to include ZImageImg2ImgPipeline for testing purposes.
* Address review comments for ZImageImg2ImgPipeline
- Add `# Copied from` annotations to encode_prompt and _encode_prompt
- Add ZImagePipeline to auto_pipeline.py for AutoPipeline support
* Add ZImage pipeline documentation
---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
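The "preparing latents" step of an image-to-image pipeline like this one generally follows the standard diffusers strength pattern, truncating the schedule so the input image is noised to an intermediate level; a sketch with illustrative names, not the actual ZImageImg2ImgPipeline code:

```python
def get_timesteps(scheduler, num_inference_steps: int, strength: float):
    # Keep only the final `strength` fraction of the schedule; the input
    # image is noised to the first retained timestep before denoising.
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    timesteps = scheduler.timesteps[t_start * scheduler.order :]
    return timesteps, num_inference_steps - t_start
```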
* fix: group offloading to support standalone computational layers in block-level offloading
* test: for models with standalone and deeply nested layers in block-level offloading
* feat: support for block-level offloading in group offloading config
* fix: group offload block modules to AutoencoderKL and AutoencoderKLWan
* fix: update group offloading tests to use AutoencoderKL and adjust input dimensions
* refactor: streamline block offloading logic
* Apply style fixes
* update tests
* update
* fix for failing tests
* clean up
* revert to use skip_keys
* clean up
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
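Block-level group offloading of the kind fixed above is driven through the model-level API; roughly (argument values are illustrative):

```python
import torch

# Sketch: enable block-level group offloading on one component; blocks
# are moved to the onload device in groups as they are needed.
pipe.transformer.enable_group_offload(
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="block_level",
    num_blocks_per_group=2,
)
```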
* start zimage model tests.
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* Revert "up"
This reverts commit bca3e27c96.
* expand upon compilation failure reason.
* Update tests/models/transformers/test_models_transformer_z_image.py
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* reinitialize the padding tokens to ones to prevent NaN problems.
* updates
* up
* skipping ZImage DiT tests
* up
* up
---------
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Fix(peft): Re-apply group offloading after deleting adapters
* Test: Add regression test for group offloading + delete_adapters
* Test: Add assertions to verify output changes after deletion
* Test: Add try/finally to clean up group offloading hooks
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
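The regression guarded by these tests is roughly the following sequence (a sketch; the LoRA repo and adapter name are hypothetical):

```python
import torch

# Sketch: deleting adapters used to strip group offloading hooks; the fix
# re-applies offloading so inference still works afterwards.
pipe.load_lora_weights("some/lora-repo", adapter_name="style")  # hypothetical
pipe.transformer.enable_group_offload(
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="leaf_level",
)
pipe.delete_adapters("style")
image = pipe(prompt).images[0]  # must still run with offloading intact
```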
* Fixes #12673.
The wrong default_stream is used, leading to a wrong execution order when record_stream is enabled.
* update
* Update test
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
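The stream bug concerns which stream Tensor.record_stream is told about: the memory must be tied to the stream that actually consumes the tensor, which is the current stream, not necessarily the default one. A generic CUDA sketch of the safe pattern (not the diffusers hook code):

```python
import torch

cpu_param = torch.randn(1024, pin_memory=True)
side_stream = torch.cuda.Stream()

# Prefetch on a side stream, then synchronize and record against the
# stream that will consume the tensor (the current stream).
with torch.cuda.stream(side_stream):
    gpu_param = cpu_param.to("cuda", non_blocking=True)
torch.cuda.current_stream().wait_stream(side_stream)
gpu_param.record_stream(torch.cuda.current_stream())
```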
* Add ZImage LoRA support and integrate into ZImagePipeline
* Add LoRA test for Z-Image
* Move the LoRA test
* Fix ZImage LoRA scale support and test configuration
* Add ZImage LoRA test overrides for architecture differences
- Override test_lora_fuse_nan to use ZImage's 'layers' attribute
instead of 'transformer_blocks'
- Skip block-level LoRA scaling test (not supported in ZImage)
- Add required imports: numpy, torch_device, check_if_lora_correctly_set
* Add ZImageLoraLoaderMixin to LoRA documentation
* Use conditional import for peft.LoraConfig in ZImage tests
* Override test_correct_lora_configs_with_different_ranks for ZImage
ZImage uses 'attention.to_k' naming convention instead of 'attn.to_k',
so the base test's module name search loop never finds a match. This
override uses the correct naming pattern for ZImage architecture.
* Add is_flaky decorator to ZImage LoRA tests; initialise padding tokens
* Skip ZImage LoRA test class entirely
Skip the entire ZImageLoRATests class due to non-deterministic behavior
from complex64 RoPE operations and torch.empty padding tokens.
LoRA functionality works correctly with real models.
Clean up removed:
- Individual @unittest.skip decorators
- @is_flaky decorator overrides for inherited methods
- Custom test method overrides
- Global torch deterministic settings
- Unused imports (numpy, is_flaky, check_if_lora_correctly_set)
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
* Add Support for Z-Image.
* Reformatting with make style, black & isort.
* Remove init, Modify import utils, Merge forward in transformers block, Remove once func in pipeline.
* modified main model forward, freqs_cis left
* refactored to add B dim
* fixed stack issue
* fixed modulation bug
* fixed modulation bug
* fix bug
* remove value_from_time_aware_config
* styling
* Fix neg embed and divide (/) bug; Reuse pad zero tensor; Turn cat -> repeat; Add hint for attn processor.
* Replace padding with pad_sequence; Add gradient checkpointing.
* Fix flash_attn3 in the attention dispatch backend via _flash_attn_forward, replacing its original implementation; add a DocString in the pipeline for it.
* Fix Docstring and Make Style.
* Revert "Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that."
This reverts commit fbf26b7ed1.
* update z-image docstring
* Revert attention dispatcher
* update z-image docstring
* styling
* Restore attention_dispatch.py to its original implementation; a later commit will address fa3 compatibility.
* Fix previous bug, and support passing pre-encoded prompt_embeds in args as a List of torch Tensors.
* Remove einop dependency.
* remove redundant imports & make fix-copies
* fix import
* Support num_images_per_prompt > 1; Remove redundant variables.
* Fix bugs for num_images_per_prompt with actual batch.
* Add unit tests for Z-Image.
* Refine unit tests and skip cases that need a separate test env; Fix unit-test compatibility in the model, mostly precision formatting.
* Add a clean env for the separate test_save_load_float16 test; Add Note; Styling.
* Update dtype mentioned by yiyi.
---------
Co-authored-by: liudongyang <liudongyang0114@gmail.com>
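The "cat -> repeat" and num_images_per_prompt changes above presumably follow the usual diffusers duplication pattern for prompt embeddings; a sketch:

```python
import torch


def duplicate_prompt_embeds(prompt_embeds: torch.Tensor, num_images_per_prompt: int) -> torch.Tensor:
    # repeat_interleave keeps all generations for the same prompt adjacent
    # in the batch, unlike a plain torch.cat of copies.
    return prompt_embeds.repeat_interleave(num_images_per_prompt, dim=0)
```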
* add vae
* Initial commit for Flux 2 Transformer implementation
* add pipeline part
* small edits to the pipeline and conversion
* update conversion script
* fix
* up up
* finish pipeline
* Remove Flux IP Adapter logic for now
* Remove deprecated 3D id logic
* Remove ControlNet logic for now
* Add link to ViT-22B paper as reference for parallel transformer blocks such as the Flux 2 single stream block
* update pipeline
* Don't use biases for input projs and output AdaNorm
* up
* Remove bias for double stream block text QKV projections
* Add script to convert Flux 2 transformer to diffusers
* make style and make quality
* fix a few things.
* allow sft files to go.
* fix image processor
* fix batch
* style a bit
* Fix some bugs in Flux 2 transformer implementation
* Fix dummy input preparation and fix some test bugs
* fix dtype casting in timestep guidance module.
* resolve conflicts.
* remove ip adapter stuff.
* Fix Flux 2 transformer consistency test
* Fix bug in Flux2TransformerBlock (double stream block)
* Get remaining Flux 2 transformer tests passing
* make style; make quality; make fix-copies
* remove stuff.
* fix type annotation.
* remove unneeded stuff from tests
* tests
* up
* up
* add sf support
* Remove unused IP Adapter and ControlNet logic from transformer (#9)
* copied from
* Apply suggestions from code review
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
* up
* up
* up
* up
* up
* Refactor Flux2Attention into separate classes for double stream and single stream attention
* Add _supports_qkv_fusion to AttentionModuleMixin to allow subclasses to disable QKV fusion
* Have Flux2ParallelSelfAttention inherit from AttentionModuleMixin with _supports_qkv_fusion=False
* Log debug message when calling fuse_projections on a AttentionModuleMixin subclass that does not support QKV fusion
* Address review comments
* Update src/diffusers/pipelines/flux2/pipeline_flux2.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>
* up
* Remove maybe_allow_in_graph decorators for Flux 2 transformer blocks (#12)
* up
* support ostris loras. (#13)
* up
* update schedule
* up
* up (#17)
* add training scripts (#16)
* add training scripts
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
* model cpu offload in validation.
* add flux.2 readme
* add img2img and tests
* cpu offload in log validation
* Apply suggestions from code review
* fix
* up
* fixes
* remove i2i training tests for now.
---------
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
Co-authored-by: linoytsaban <linoy@huggingface.co>
* up
---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-10-53-87-203.ec2.internal>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
Co-authored-by: linoytsaban <linoy@huggingface.co>
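The QKV-fusion opt-out added for Flux2ParallelSelfAttention boils down to a class flag checked in fuse_projections; schematically (simplified, not the full mixin):

```python
import logging

logger = logging.getLogger(__name__)


class AttentionModuleMixin:
    _supports_qkv_fusion = True

    def fuse_projections(self):
        if not self._supports_qkv_fusion:
            logger.debug(f"{type(self).__name__} does not support QKV fusion; skipping.")
            return
        # ... fuse separate q/k/v linear layers into one projection ...


class Flux2ParallelSelfAttention(AttentionModuleMixin):
    # The parallel block already computes QKV jointly, so fusing separate
    # projections does not apply.
    _supports_qkv_fusion = False
```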
* add tests for qwenimage modular.
* qwenimage edit.
* qwenimage edit plus.
* empty
* align with the latest structure
* up
* up
* reason
* up
* fix multiple issues.
* up
* up
* fix
* up
* make it similar to the original pipeline.
* Bria FIBO pipeline
* style fixs
* fix CR
* Refactor BriaFibo classes and update pipeline parameters
- Updated BriaFiboAttnProcessor and BriaFiboAttention classes to reflect changes from Flux equivalents.
- Modified the _unpack_latents method in BriaFiboPipeline to improve clarity.
- Increased the default max_sequence_length to 3000 and added a new optional parameter do_patching.
- Cleaned up test_pipeline_bria_fibo.py by removing unused imports and skipping unsupported tests.
* edit the docs of FIBO
* Remove unused BriaFibo imports and update CPU offload method in BriaFiboPipeline
* Refactor FIBO classes to BriaFibo naming convention
- Updated class names from FIBO to BriaFibo for consistency across the module.
- Modified instances of FIBOEmbedND, FIBOTimesteps, TextProjection, and TimestepProjEmbeddings to reflect the new naming.
- Ensured all references in the BriaFiboTransformer2DModel are updated accordingly.
* Add BriaFiboTransformer2DModel import to transformers module
* Remove unused BriaFibo imports from modular pipelines and add BriaFiboTransformer2DModel and BriaFiboPipeline classes to dummy objects for enhanced compatibility with torch and transformers.
* Update BriaFibo classes with copied documentation and fix import typo in pipeline module
- Added documentation comments indicating the source of copied code in BriaFiboTransformerBlock and _pack_latents methods.
- Corrected the import statement for BriaFiboPipeline in the pipelines module.
* Remove unused BriaFibo imports from __init__.py to streamline modular pipelines.
* Refactor documentation comments in BriaFibo classes to indicate inspiration from existing implementations
- Updated comments in BriaFiboAttnProcessor, BriaFiboAttention, and BriaFiboPipeline to reflect that the code is inspired by other modules rather than copied.
- Enhanced clarity on the origins of the methods to maintain proper attribution.
* change Inspired by to Based on
* add reference link and fix trailing whitespace
* Add BriaFiboTransformer2DModel documentation and update comments in BriaFibo classes
- Introduced a new documentation file for BriaFiboTransformer2DModel.
- Updated comments in BriaFiboAttnProcessor, BriaFiboAttention, and BriaFiboPipeline to clarify the origins of the code, indicating copied sources for better attribution.
---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
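The _pack_latents/_unpack_latents methods annotated above are, per the "Based on" comments, the Flux-style 2x2 patchification; a sketch of the packing direction:

```python
import torch


def pack_latents(latents: torch.Tensor) -> torch.Tensor:
    # Flux-style packing: group each 2x2 spatial patch of the latent grid
    # into a single token with 4x the channel dimension.
    b, c, h, w = latents.shape
    latents = latents.view(b, c, h // 2, 2, w // 2, 2)
    latents = latents.permute(0, 2, 4, 1, 3, 5)
    return latents.reshape(b, (h // 2) * (w // 2), c * 4)
```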
* rename photon to prx
* rename photon into prx
* Revert .gitignore to state before commit b7fb0fe9d6
* make fix-copies