diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Author	SHA1	Message	Date
sayakpaul	765eb50ff1	up	2026-01-15 08:50:35 +05:30
sayakpaul	7ad97d492d	resolve conflicts.	2026-01-15 08:32:22 +05:30
Marc Sun	d7fa445453	Remove 8bit device restriction (#12972 ) * allow to * update version * fix version again * again * Update src/diffusers/pipelines/pipeline_utils.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * style * xfail * add pr --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2026-01-14 20:33:30 +05:30
Sayak Paul	7feb4fc791	[chore] make transformers version check stricter for glm image. (#12974 ) * make transformers version check stricter for glm image. * public checkpoint.	2026-01-14 10:29:48 +05:30
Sayak Paul	7299121413	Z rz rz rz rz rz rz r cogview (#12973 ) * init * add * add 1 * Update __init__.py * rename * 2 * update * init with encoder * merge2pipeline * Update pipeline_glm_image.py * remove sop * remove useless func * Update pipeline_glm_image.py * up (cherry picked from commit `cfe19a31b9`) * review for work only * change place * Update pipeline_glm_image.py * update * Update transformer_glm_image.py * 1 * no negative_prompt for GLM-Image * remove CogView4LoraLoaderMixin * refactor attention processor. * update * fix * use staticmethod * update * up * up * update * Update glm_image.md * 1 * Update pipeline_glm_image.py * Update transformer_glm_image.py * using new transformers impl * support * resolution change * fix-copies * Update src/diffusers/pipelines/glm_image/pipeline_glm_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update pipeline_glm_image.py * use cogview4 * Update pipeline_glm_image.py * Update pipeline_glm_image.py * revert * update * batch support * update * version guard glm image pipeline * validate prompt_embeds and prior_token_ids * try docs. * 4 * up * up * skip properly * fix tests * up * up --------- Co-authored-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: yiyixuxu <yixu310@gmail.com>	2026-01-13 06:39:22 -10:00
sayakpaul	987412b252	up	2026-01-13 10:19:08 +05:30
sayakpaul	1426c33aa5	up	2026-01-13 09:13:46 +05:30
Sayak Paul	34388bdaa8	Merge branch 'main' into remove-explicit-typing	2026-01-13 09:00:49 +05:30
sayakpaul	5ee4e19c58	handle modern types.	2026-01-13 09:00:34 +05:30
dg845	f1a93c765f	Add Flag to `PeftLoraLoaderMixinTests` to Enable/Disable Text Encoder LoRA Tests (#12962 ) * Improve incorrect LoRA format error message * Add flag in PeftLoraLoaderMixinTests to disable text encoder LoRA tests * Apply changes to LTX2LoraTests * Further improve incorrect LoRA format error msg following review --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2026-01-12 16:01:58 -08:00
sayakpaul	a9af091700	up	2026-01-12 14:03:28 +05:30
sayakpaul	db627652b1	up	2026-01-12 13:57:55 +05:30
Kashif Rasul	dad5cb55e6	Fix QwenImage txt_seq_lens handling (#12702 ) * Fix QwenImage txt_seq_lens handling * formatting * formatting * remove txt_seq_lens and use bool mask * use compute_text_seq_len_from_mask * add seq_lens to dispatch_attention_fn * use joint_seq_lens * remove unused index_block * WIP: Remove seq_lens parameter and use mask-based approach - Remove seq_lens parameter from dispatch_attention_fn - Update varlen backends to extract seqlens from masks - Update QwenImage to pass 2D joint_attention_mask - Fix native backend to handle 2D boolean masks - Fix sage_varlen seqlens_q to match seqlens_k for self-attention Note: sage_varlen still producing black images, needs further investigation * fix formatting * undo sage changes * xformers support * hub fix * fix torch compile issues * fix tests * use _prepare_attn_mask_native * proper deprecation notice * add deprecate to txt_seq_lens * Update src/diffusers/models/transformers/transformer_qwenimage.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_qwenimage.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Only create the mask if there's actual padding * fix order of docstrings * Adds performance benchmarks and optimization details for QwenImage Enhances documentation with comprehensive performance insights for QwenImage pipeline: * rope_text_seq_len = text_seq_len * rename to max_txt_seq_len * removed deprecated args * undo unrelated change * Updates QwenImage performance documentation Removes detailed attention backend benchmarks and simplifies torch.compile performance description Focuses on key performance improvement with torch.compile, highlighting the specific speedup from 4.70s to 1.93s on an A100 GPU Streamlines the documentation to provide more concise and actionable performance insights * Updates deprecation warnings for txt_seq_lens parameter Extends deprecation timeline for txt_seq_lens from version 0.37.0 to 0.39.0 across multiple Qwen 
image-related models Adds a new unit test to verify the deprecation warning behavior for the txt_seq_lens parameter * fix compile * formatting * fix compile tests * rename helper * remove duplicate * smaller values * removed * use torch.cond for torch compile * Construct joint attention mask once * test different backends * construct joint attention mask once to avoid reconstructing in every block * Update src/diffusers/models/attention_dispatch.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * formatting * raising an error from the EditPlus pipeline when batch_size > 1 --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: cdutr <dutra_carlos@hotmail.com>	2026-01-12 13:45:09 +05:30
Sayak Paul	ed6e5ecf67	[LoRA] add LoRA support to LTX-2 (#12933 ) * up * fixes * tests * docs. * fix * change loading info. * up * up	2026-01-10 11:27:22 +05:30
Sayak Paul	3981c955ce	[modular] Tests for custom blocks in modular diffusers (#12557 ) * start custom block testing. * simplify modular workflow ci. * up * style. * up * up * up * up * up * up * Apply suggestions from code review * up	2026-01-09 15:57:23 +05:30
Howard Zhang	2f66edc880	Torchao floatx version guard (#12923 ) * Adding torchao version guard for floatx usage Summary: TorchAO removing floatx support, added version guard in quantization_config.py * Adding torchao version guard for floatx usage Summary: TorchAO removing floatx support, added version guard in quantization_config.py Altered tests in test_torchao.py to version guard floatx Created new test to verify version guard of floatx support --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2026-01-09 10:51:53 +05:30
dg845	c10bdd9b73	Add LTX 2.0 Video Pipelines (#12915 ) * Initial LTX 2.0 transformer implementation * Add tests for LTX 2 transformer model * Get LTX 2 transformer tests working * Rename LTX 2 compile test class to have LTX2 * Remove RoPE debug print statements * Get LTX 2 transformer compile tests passing * Fix LTX 2 transformer shape errors * Initial script to convert LTX 2 transformer to diffusers * Add more LTX 2 transformer audio arguments * Allow LTX 2 transformer to be loaded from local path for conversion * Improve dummy inputs and add test for LTX 2 transformer consistency * Fix LTX 2 transformer bugs so consistency test passes * Initial implementation of LTX 2.0 video VAE * Explicitly specify temporal and spatial VAE scale factors when converting * Add initial LTX 2.0 video VAE tests * Add initial LTX 2.0 video VAE tests (part 2) * Get diffusers implementation on par with official LTX 2.0 video VAE implementation * Initial LTX 2.0 vocoder implementation * Use RMSNorm implementation closer to original for LTX 2.0 video VAE * start audio decoder. * init registration. 
* up * simplify and clean up * up * Initial LTX 2.0 text encoder implementation * Rough initial LTX 2.0 pipeline implementation * up * up * up * up * Add imports for LTX 2.0 Audio VAE * Conversion script for LTX 2.0 Audio VAE Decoder * Add Audio VAE logic to T2V pipeline * Duplicate scheduler for audio latents * Support num_videos_per_prompt for prompt embeddings * LTX 2.0 scheduler and full pipeline conversion * Add script to test full LTX2Pipeline T2V inference * Fix pipeline return bugs * Add LTX 2 text encoder and vocoder to ltx2 subdirectory __init__ * Fix more bugs in LTX2Pipeline.__call__ * Improve CPU offload support * Fix pipeline audio VAE decoding dtype bug * Fix video shape error in full pipeline test script * Get LTX 2 T2V pipeline to produce reasonable outputs * Make LTX 2.0 scheduler more consistent with original code * Fix typo when applying scheduler fix in T2V inference script * Refactor Audio VAE to be simpler and remove helpers (#7) * remove resolve causality axes stuff. * remove a bunch of helpers. * remove adjust output shape helper. * remove the use of audiolatentshape. * move normalization and patchify out of pipeline. * fix * up * up * Remove unpatchify and patchify ops before audio latents denormalization (#9) --------- Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Add support for I2V (#8) * start i2v. * up * up * up * up * up * remove uniform strategy code. * remove unneeded code. * Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11) * test i2v. * Move Video and Audio Text Encoder Connectors to Transformer (#12) * Denormalize audio latents in I2V pipeline (analogous to T2V change) * Initial refactor to put video and audio text encoder connectors in transformer * Get LTX 2 transformer tests working after connector refactor * precompute run_connectors,. 
* fixes * Address review comments * Calculate RoPE double precisions freqs using torch instead of np * Further simplify LTX 2 RoPE freq calc * Make connectors a separate module (#18) * remove text_encoder.py * address yiyi's comments. * up * up * up * up --------- Co-authored-by: sayakpaul <spsayakpaul@gmail.com> * up (#19) * address initial feedback from lightricks team (#16) * cross_attn_timestep_scale_multiplier to 1000 * implement split rope type. * up * propagate rope_type to rope embed classes as well. * up * When using split RoPE, make sure that the output dtype is same as input dtype * Fix apply split RoPE shape error when reshaping x to 4D * Add export_utils file for exporting LTX 2.0 videos with audio * Tests for T2V and I2V (#6) * add ltx2 pipeline tests. * up * up * up * up * remove content * style * Denormalize audio latents in I2V pipeline (analogous to T2V change) * Initial refactor to put video and audio text encoder connectors in transformer * Get LTX 2 transformer tests working after connector refactor * up * up * i2v tests. * up * Address review comments * Calculate RoPE double precisions freqs using torch instead of np * Further simplify LTX 2 RoPE freq calc * revert unneded changes. * up * up * update to split style rope. * up --------- Co-authored-by: Daniel Gu <dgu8957@gmail.com> * up * use export util funcs. * Point original checkpoint to LTX 2.0 official checkpoint * Allow the I2V pipeline to accept image URLs * make style and make quality * remove function map. * remove args. * update docs. * update doc entries. 
* disable ltx2_consistency test * Simplify LTX 2 RoPE forward by removing coords is None logic * make style and make quality * Support LTX 2.0 audio VAE encoder * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Remove print statement in audio VAE * up * Fix bug when calculating audio RoPE coords * Ltx 2 latent upsample pipeline (#12922) * Initial implementation of LTX 2.0 latent upsampling pipeline * Add new LTX 2.0 spatial latent upsampler logic * Add test script for LTX 2.0 latent upsampling * Add option to enable VAE tiling in upsampling test script * Get latent upsampler working with video latents * Fix typo in BlurDownsample * Add latent upsample pipeline docstring and example * Remove deprecated pipeline VAE slicing/tiling methods * make style and make quality * When returning latents, return unpacked and denormalized latents for T2V and I2V * Add model_cpu_offload_seq for latent upsampling pipeline --------- Co-authored-by: Daniel Gu <dgu8957@gmail.com> * Fix latent upsampler filename in LTX 2 conversion script * Add latent upsample pipeline to LTX 2 docs * Add dummy objects for LTX 2 latent upsample pipeline * Set default FPS to official LTX 2 ckpt default of 24.0 * Set default CFG scale to official LTX 2 ckpt default of 4.0 * Update LTX 2 pipeline example docstrings * make style and make quality * Remove LTX 2 test scripts * Fix LTX 2 upsample pipeline example docstring * Add logic to convert and save a LTX 2 upsampling pipeline * Document LTX2VideoTransformer3DModel forward pass --------- Co-authored-by: sayakpaul <spsayakpaul@gmail.com>	2026-01-07 21:24:27 -08:00
Miguel Martin	b5309683cb	Cosmos Predict2.5 Base: inference pipeline, scheduler & chkpt conversion (#12852 ) * cosmos predict2.5 base: convert chkpt & pipeline - New scheduler: scheduling_flow_unipc_multistep.py - Changes to TransformerCosmos for text embeddings via crossattn_proj * scheduler cleanup * simplify inference pipeline * cleanup scheduler + tests * Basic tests for flow unipc * working b2b inference * Rename everything * Tests for pipeline present, but not working (predict2 also not working) * docstring update * wrapper pipelines + make style * remove unnecessary files * UniPCMultistep: support use_karras_sigmas=True and use_flow_sigmas=True * use UniPCMultistepScheduler + fix tests for pipeline * Remove FlowUniPCMultistepScheduler * UniPCMultistepScheduler for use_flow_sigmas=True & use_karras_sigmas=True * num_inference_steps=36 due to bug in scheduler used by predict2.5 * Address comments * make style + make fix-copies * fix tests + remove references to old pipelines * address comments * add revision in from_pretrained call * fix tests	2025-12-19 05:38:18 +05:30
Wang, Yi	87f7d11143	extend TorchAoTest::test_model_memory_usage to other platform (#12768 ) * extend TorchAoTest::test_model_memory_usage to other platform Signed-off-by: Wang, Yi <yi.a.wang@intel.com> * add some comments Signed-off-by: Wang, Yi <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi <yi.a.wang@intel.com>	2025-12-17 13:44:08 +05:30
junqiangwu	a748a839ad	Add support for LongCat-Image (#12828 ) * Add LongCat-Image * Update src/diffusers/models/transformers/transformer_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * fix code * add doc * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image_edit.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image_edit.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * fix code & mask style & fix-copies * Apply style fixes * fix single input rewrite error --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: 
hadoop-imagen <hadoop-imagen@psxfb7pxrbvmh3oq-worker-0.psxfb7pxrbvmh3oq.hadoop-aipnlp.svc.cluster.local>	2025-12-15 07:45:17 -10:00
Wang, Yi	0c1ccc0775	fix pytest tests/pipelines/pixart_sigma/test_pixart.py::PixArtSigmaPi… (#12842 ) fix pytest tests/pipelines/pixart_sigma/test_pixart.py::PixArtSigmaPipelineIntegrationTests::test_pixart_512 in xpu Signed-off-by: Wang, Yi <yi.a.wang@intel.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-15 14:36:01 +05:30
Dhruv Nair	be3c2a0667	[WIP] Add Flux2 modular (#12763 ) * update * update * update * update * update * update * update * update * update * update	2025-12-10 12:19:07 +05:30
Sayak Paul	8b4722de57	Fix Qwen Edit Plus modular for multi-image input (#12601 ) * try to fix qwen edit plus multi images (modular) * up * up * test * up * up	2025-12-09 10:08:30 -10:00
CalamitousFelicitousness	2246d2c7c4	Add ZImageImg2ImgPipeline (#12751 ) * Add ZImageImg2ImgPipeline Updated the pipeline structure to include ZImageImg2ImgPipeline alongside ZImagePipeline. Implemented the ZImageImg2ImgPipeline class for image-to-image transformations, including necessary methods for encoding prompts, preparing latents, and denoising. Enhanced the auto_pipeline to map the new ZImageImg2ImgPipeline for image generation tasks. Added unit tests for ZImageImg2ImgPipeline to ensure functionality and performance. Updated dummy objects to include ZImageImg2ImgPipeline for testing purposes. * Address review comments for ZImageImg2ImgPipeline - Add `# Copied from` annotations to encode_prompt and _encode_prompt - Add ZImagePipeline to auto_pipeline.py for AutoPipeline support * Add ZImage pipeline documentation --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>	2025-12-07 22:06:23 -10:00
Tran Thanh Luan	6290fdfda4	[Feat] TaylorSeer Cache (#12648 ) * init taylor_seer cache * make compatible with any tuple size returned * use logger for printing, add warmup feature * still update in warmup steps * refactor, add docs * add configurable cache, skip compute module * allow special cache ids only * add stop_predicts (cooldown) * update docs * apply ruff * update to handle multiple calls per timestep * refactor to use state manager * fix format & doc * chores: naming, remove redundancy * add docs * quality & style * fix taylor precision * Apply style fixes * add tests * Apply style fixes * Remove TaylorSeerCacheTesterMixin from flux2 tests * rename identifiers, use more expressive taylor predict loop * torch compile compatible * Apply style fixes * Update src/diffusers/hooks/taylorseer_cache.py Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * update docs * make fix-copies * fix example usage. * remove tests on flux kontext --------- Co-authored-by: toilaluan <toilaluan@github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-06 05:39:54 +05:30
swappy	f12d161d67	Fix broken group offloading with block_level for models with standalone layers (#12692 ) * fix: group offloading to support standalone computational layers in block-level offloading * test: for models with standalone and deeply nested layers in block-level offloading * feat: support for block-level offloading in group offloading config * fix: group offload block modules to AutoencoderKL and AutoencoderKLWan * fix: update group offloading tests to use AutoencoderKL and adjust input dimensions * refactor: streamline block offloading logic * Apply style fixes * update tests * update * fix for failing tests * clean up * revert to use skip_keys * clean up --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2025-12-05 18:54:05 +05:30
Sayak Paul	a1f36ee3ef	[Z-Image] various small changes, Z-Image transformer tests, etc. (#12741 ) * start zimage model tests. * up * up * up * up * up * up * up * up * up * up * up * up * Revert "up" This reverts commit `bca3e27c96`. * expand upon compilation failure reason. * Update tests/models/transformers/test_models_transformer_z_image.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * reinitialize the padding tokens to ones to prevent NaN problems. * updates * up * skipping ZImage DiT tests * up * up --------- Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2025-12-03 19:35:46 +05:30
Sayak Paul	d96cbacacd	[tests] fix hunyuanvideo 1.5 offloading tests. (#12782 ) fix hunyuanvideo 1.5 offloading tests.	2025-12-03 18:07:59 +05:30
Aditya Borate	5ab5946931	Fix: leaf_level offloading breaks after delete_adapters (#12639 ) * Fix(peft): Re-apply group offloading after deleting adapters * Test: Add regression test for group offloading + delete_adapters * Test: Add assertions to verify output changes after deletion * Test: Add try/finally to clean up group offloading hooks --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-03 17:39:11 +05:30
Lev Novitskiy	d0c54e5563	Kandinsky 5.0 Video Pro and Image Lite (#12664 ) * add transformer pipeline first version --------- Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Charles <charles@huggingface.co> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: dmitrienkoae <dmitrienko.ae@phystech.edu> Co-authored-by: nvvaulin <nvvaulin@gmail.com>	2025-12-03 00:46:37 -10:00
Kimbing Ng	3c05b9f71c	Fixes #12673 . `record_stream` in group offloading is not working properly (#12721 ) * Fixes #12673. Wrong default_stream is used, leading to wrong execution order when record_stream is enabled. * update * Update test --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-03 11:37:11 +05:30
Guo-Hua Wang	4f136f842c	Add support for Ovis-Image (#12740 ) * add ovis_image * fix code quality * optimize pipeline_ovis_image.py according to the feedbacks * optimize imports * add docs * make style * make style * add ovis to toctree * oops --------- Co-authored-by: YiYi Xu <yixu310@gmail.com>	2025-12-02 11:48:07 -10:00
CalamitousFelicitousness	edf36f5128	Add ZImage LoRA support and integrate into ZImagePipeline (#12750 ) * Add ZImage LoRA support and integrate into ZImagePipeline * Add LoRA test for Z-Image * Move the LoRA test * Fix ZImage LoRA scale support and test configuration * Add ZImage LoRA test overrides for architecture differences - Override test_lora_fuse_nan to use ZImage's 'layers' attribute instead of 'transformer_blocks' - Skip block-level LoRA scaling test (not supported in ZImage) - Add required imports: numpy, torch_device, check_if_lora_correctly_set * Add ZImageLoraLoaderMixin to LoRA documentation * Use conditional import for peft.LoraConfig in ZImage tests * Override test_correct_lora_configs_with_different_ranks for ZImage ZImage uses 'attention.to_k' naming convention instead of 'attn.to_k', so the base test's module name search loop never finds a match. This override uses the correct naming pattern for ZImage architecture. * Add is_flaky decorator to ZImage LoRA tests initialise padding tokens * Skip ZImage LoRA test class entirely Skip the entire ZImageLoRATests class due to non-deterministic behavior from complex64 RoPE operations and torch.empty padding tokens. LoRA functionality works correctly with real models. Clean up removed: - Individual @unittest.skip decorators - @is_flaky decorator overrides for inherited methods - Custom test method overrides - Global torch deterministic settings - Unused imports (numpy, is_flaky, check_if_lora_correctly_set) --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>	2025-12-02 02:16:30 -03:00
YiYi Xu	6156cf8f22	Hunyuanvideo15 (#12696 ) * add --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-11-30 20:27:59 -10:00
Dhruv Nair	b010a8ce0c	[Modular] Add single file support to Modular (#12383 ) * update * update * update * update * Apply style fixes * update * update * update * update * update --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-11-28 22:23:04 +05:30
Jerry Wu	e6d4612309	Support unittest for Z-image ⚡️ (#12715 ) * Add Support for Z-Image. * Reformatting with make style, black & isort. * Remove init, Modify import utils, Merge forward in transformers block, Remove once func in pipeline. * modified main model forward, freqs_cis left * refactored to add B dim * fixed stack issue * fixed modulation bug * fixed modulation bug * fix bug * remove value_from_time_aware_config * styling * Fix neg embed and devide / bug; Reuse pad zero tensor; Turn cat -> repeat; Add hint for attn processor. * Replace padding with pad_sequence; Add gradient checkpointing. * Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that. * Fix Docstring and Make Style. * Revert "Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that." This reverts commit `fbf26b7ed1`. * update z-image docstring * Revert attention dispatcher * update z-image docstring * styling * Recover attention_dispatch.py with its origin impl, later would special commit for fa3 compatibility. * Fix prev bug, and support for prompt_embeds pass in args after prompt pre-encode as List of torch Tensor. * Remove einop dependency. * remove redundant imports & make fix-copies * fix import * Support for num_images_per_prompt>1; Remove redundant unquote variables. * Fix bugs for num_images_per_prompt with actual batch. * Add unit tests for Z-Image. * Refine unitest and skip for cases needed separate test env; Fix compatibility with unitest in model, mostly precision formating. * Add clean env for test_save_load_float16 separ test; Add Note; Styling. * Update dtype mentioned by yiyi. --------- Co-authored-by: liudongyang <liudongyang0114@gmail.com>	2025-11-26 07:18:57 -10:00
Sayak Paul	b91e8c0d0b	[lora]: Fix Flux2 LoRA NaN test (#12714 ) * up * Update tests/lora/test_lora_layers_flux2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> --------- Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2025-11-26 09:07:48 +05:30
Sayak Paul	5ffb73d4ae	let's go Flux2 🚀 (#12711 ) * add vae * Initial commit for Flux 2 Transformer implementation * add pipeline part * small edits to the pipeline and conversion * update conversion script * fix * up up * finish pipeline * Remove Flux IP Adapter logic for now * Remove deprecated 3D id logic * Remove ControlNet logic for now * Add link to ViT-22B paper as reference for parallel transformer blocks such as the Flux 2 single stream block * update pipeline * Don't use biases for input projs and output AdaNorm * up * Remove bias for double stream block text QKV projections * Add script to convert Flux 2 transformer to diffusers * make style and make quality * fix a few things. * allow sft files to go. * fix image processor * fix batch * style a bit * Fix some bugs in Flux 2 transformer implementation * Fix dummy input preparation and fix some test bugs * fix dtype casting in timestep guidance module. * resolve conflicts., * remove ip adapter stuff. * Fix Flux 2 transformer consistency test * Fix bug in Flux2TransformerBlock (double stream block) * Get remaining Flux 2 transformer tests passing * make style; make quality; make fix-copies * remove stuff. * fix type annotaton. 
* remove unneeded stuff from tests * tests * up * up * add sf support * Remove unused IP Adapter and ControlNet logic from transformer (#9) * copied from * Apply suggestions from code review Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: apolinário <joaopaulo.passos@gmail.com> * up * up * up * up * up * Refactor Flux2Attention into separate classes for double stream and single stream attention * Add _supports_qkv_fusion to AttentionModuleMixin to allow subclasses to disable QKV fusion * Have Flux2ParallelSelfAttention inherit from AttentionModuleMixin with _supports_qkv_fusion=False * Log debug message when calling fuse_projections on a AttentionModuleMixin subclass that does not support QKV fusion * Address review comments * Update src/diffusers/pipelines/flux2/pipeline_flux2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * up * Remove maybe_allow_in_graph decorators for Flux 2 transformer blocks (#12) * up * support ostris loras. (#13) * up * update schdule * up * up (#17) * add training scripts (#16) * add training scripts Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com> * model cpu offload in validation. * add flux.2 readme * add img2img and tests * cpu offload in log validation * Apply suggestions from code review * fix * up * fixes * remove i2i training tests for now. --------- Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com> Co-authored-by: linoytsaban <linoy@huggingface.co> * up --------- Co-authored-by: yiyixuxu <yixu310@gmail.com> Co-authored-by: Daniel Gu <dgu8957@gmail.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-10-53-87-203.ec2.internal> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: apolinário <joaopaulo.passos@gmail.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com> Co-authored-by: linoytsaban <linoy@huggingface.co>	2025-11-25 21:49:04 +05:30
Jerry Wu	4088e8a851	Add Support for Z-Image Series (#12703 ) * Add Support for Z-Image. * Reformatting with make style, black & isort. * Remove init, Modify import utils, Merge forward in transformers block, Remove once func in pipeline. * modified main model forward, freqs_cis left * refactored to add B dim * fixed stack issue * fixed modulation bug * fixed modulation bug * fix bug * remove value_from_time_aware_config * styling * Fix neg embed and devide / bug; Reuse pad zero tensor; Turn cat -> repeat; Add hint for attn processor. * Replace padding with pad_sequence; Add gradient checkpointing. * Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that. * Fix Docstring and Make Style. * Revert "Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that." This reverts commit `fbf26b7ed1`. * update z-image docstring * Revert attention dispatcher * update z-image docstring * styling * Recover attention_dispatch.py with its origin impl, later would special commit for fa3 compatibility. * Fix prev bug, and support for prompt_embeds pass in args after prompt pre-encode as List of torch Tensor. * Remove einop dependency. * remove redundant imports & make fix-copies * fix import --------- Co-authored-by: liudongyang <liudongyang0114@gmail.com>	2025-11-25 05:50:00 -10:00
Sayak Paul	d176f61fcf	[core] support sage attention + FA2 through `kernels` (#12439 ) * up * support automatic dispatch. * disable compile support for now./ * up * flash too. * document. * up * up * up * up	2025-11-24 16:58:07 +05:30
Dhruv Nair	a96b145304	[CI] Fix failing Pipeline CPU tests (#12681 ) update Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-11-19 21:19:24 +05:30
Sayak Paul	ab71f3c864	[core] Refactor hub attn kernels (#12475 ) * refactor how attention kernels from hub are used. * up * refactor according to Dhruv's ideas. Co-authored-by: Dhruv Nair <dhruv@huggingface.co> * empty Co-authored-by: Dhruv Nair <dhruv@huggingface.co> * empty Co-authored-by: Dhruv Nair <dhruv@huggingface.co> * empty Co-authored-by: dn6 <dhruv@huggingface.co> * up --------- Co-authored-by: Dhruv Nair <dhruv@huggingface.co> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2025-11-19 08:19:00 +05:30
Junsong Chen	1afc21855e	SANA-Video Image to Video pipeline `SanaImageToVideoPipeline` support (#12634 ) * move sana-video to a new dir and add `SanaImageToVideoPipeline` with no modify; * fix bug and run text/image-to-vidoe success; * make style; quality; fix-copies; * add sana image-to-video pipeline in markdown; * add test case for sana image-to-video; * make style; * add a init file in sana-video test dir; * Update src/diffusers/pipelines/sana_video/pipeline_sana_video_i2v.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update tests/pipelines/sana_video/test_sana_video_i2v.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/sana_video/pipeline_sana_video_i2v.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/sana_video/pipeline_sana_video_i2v.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update tests/pipelines/sana_video/test_sana_video_i2v.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * minor update; * fix bug and skip fp16 save test; Co-authored-by: Yuyang Zhao <43061147+HeliosZhao@users.noreply.github.com> * Update src/diffusers/pipelines/sana_video/pipeline_sana_video_i2v.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/sana_video/pipeline_sana_video_i2v.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/sana_video/pipeline_sana_video_i2v.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/sana_video/pipeline_sana_video_i2v.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * add copied from for `encode_prompt` * Apply style fixes --------- Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> Co-authored-by: Yuyang Zhao <43061147+HeliosZhao@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-11-17 00:23:34 -08:00
Sayak Paul	cd3bbe2910	skip autoencoderdl layerwise casting memory (#12647 )	2025-11-13 12:56:22 +05:30
kaixuanliu	7a001c3ee2	adjust unit tests for `test_save_load_float16` (#12500 ) * adjust unit tests for wan pipeline Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com> * update code Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com> * avoid adjusting common `get_dummy_components` API Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com> * use `form_pretrained` to `transformer` and `transformer_2` Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com> * update code Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com> * update Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com> --------- Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2025-11-13 11:57:12 +05:30
dg845	d8e4805816	[WIP]Add Wan2.2 Animate Pipeline (Continuation of #12442 by tolgacangoz) (#12526 ) --------- Co-authored-by: Tolga Cangöz <mtcangoz@gmail.com> Co-authored-by: Tolga Cangöz <46008593+tolgacangoz@users.noreply.github.com>	2025-11-12 16:52:31 -10:00
Sayak Paul	f5e5f34823	[modular] add tests for qwen modular (#12585 ) * add tests for qwenimage modular. * qwenimage edit. * qwenimage edit plus. * empty * align with the latest structure * up * up * reason * up * fix multiple issues. * up * up * fix * up * make it similar to the original pipeline.	2025-11-12 17:37:42 +05:30
Dhruv Nair	66e6a0215f	[CI] Remove unittest dependency from `testing_utils.py` (#12621 ) * update * Update tests/testing_utils.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update tests/testing_utils.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Apply style fixes --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-11-11 16:40:39 +05:30
Jay Wu	04f9d2bf3d	add ChronoEdit (#12593 ) * add ChronoEdit * add ref to original function & remove wan2.2 logics * Update src/diffusers/pipelines/chronoedit/pipeline_chronoedit.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/chronoedit/pipeline_chronoedit.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * add ChronoeEdit test * add docs * add docs * make fix-copies * fix chronoedit test --------- Co-authored-by: wjay <wjay@nvidia.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-11-09 22:07:00 -08:00
Dhruv Nair	8ac17cd2cb	[Modular] Some clean up for Modular tests (#12579 ) * update * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-11-07 08:19:15 +05:30

1647 Commits