diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Author	SHA1	Message	Date
RuoyiDu	f6b6a7181e	Add z-image-omni-base implementation (#12857 ) * Add z-image-omni-base implementation * Merged into one transformer for Z-Image. * Fix bugs for controlnet after merging the main branch new feature. * Fix for auto_pipeline, Add Styling. * Refactor noise handling and modulation - Add select_per_token function for per-token value selection - Separate adaptive modulation logic - Cleanify t_noisy/clean variable naming - Move image_noise_mask handler from forward to pipeline * Styling & Formatting. * Rewrite code with more non-forward func & clean forward. 1.Change to one forward with shorter code with omni code (None). 2.Split out non-forward funcs: _build_unified_sequence, _prepare_sequence, patchify, pad. * Styling & Formatting. * Manual check fix-copies in controlnet, Add select_per_token, _patchify_image, _pad_with_ids; Styling. * Add Import in pipeline __init__.py. --------- Co-authored-by: Jerry Qilong Wu <xinglong.wql@alibaba-inc.com> Co-authored-by: YiYi Xu <yixu310@gmail.com>	2025-12-23 23:45:35 -10:00
Alvaro Bartolome	52766e6a69	Use `T5Tokenizer` instead of `MT5Tokenizer` (removed in Transformers v5.0+) (#12877 ) Use `T5Tokenizer` instead of `MT5Tokenizer` Given that the `MT5Tokenizer` in `transformers` is just a "re-export" of `T5Tokenizer` as per https://github.com/huggingface/transformers/blob/v4.57.3/src/transformers/models/mt5/tokenization_mt5.py )on latest available stable Transformers i.e., v4.57.3), this commit updates the imports to point to `T5Tokenizer` instead, so that those still work with Transformers v5.0.0rc0 onwards.	2025-12-23 06:57:41 -10:00
Miguel Martin	973a077c6a	Cosmos Predict2.5 14b Conversion (#12863 ) 14b conversion	2025-12-22 08:02:06 -10:00
Alvaro Bartolome	0c4f6c9cff	Add `OvisImagePipeline` in `AUTO_TEXT2IMAGE_PIPELINES_MAPPING` (#12876 )	2025-12-22 07:14:03 -10:00
MatrixTeam-AI	262ce19bff	Feature: Add Mambo-G Guidance as Guider (#12862 ) * Feature: Add Mambo-G Guidance to Qwen-Image Pipeline * change to guider implementation * fix copied code residual * Update src/diffusers/guiders/magnitude_aware_guidance.py * Apply style fixes --------- Co-authored-by: Pscgylotti <pscgylotti@github.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-12-19 13:10:40 -10:00
YiYi Xu	f7753b1bc8	more update in modular (#12560 ) * move node registry to mellon * up * fix * modula rpipeline update: filter out none for input_names, fix default blocks for pipe.init() and allow user pass additional kwargs_type in a dict * qwen modular refactor, unpack before decode * update mellon node config, adding* to required_inputs and required_model_inputs * modularpipeline.from_pretrained: error out if no config found * add a component_names property to modular blocks to be consistent! * flux image_encoder -> vae_encoder * controlnet_bundle * refator MellonNodeConfig MellonPipelineConfig * refactor & simplify mellon utils * vae_image_encoder -> vae_encoder * mellon config save keep key order * style + copies * add kwargs input for zimage	2025-12-18 19:25:20 -10:00
Miguel Martin	b5309683cb	Cosmos Predict2.5 Base: inference pipeline, scheduler & chkpt conversion (#12852 ) * cosmos predict2.5 base: convert chkpt & pipeline - New scheduler: scheduling_flow_unipc_multistep.py - Changes to TransformerCosmos for text embeddings via crossattn_proj * scheduler cleanup * simplify inference pipeline * cleanup scheduler + tests * Basic tests for flow unipc * working b2b inference * Rename everything * Tests for pipeline present, but not working (predict2 also not working) * docstring update * wrapper pipelines + make style * remove unnecessary files * UniPCMultistep: support use_karras_sigmas=True and use_flow_sigmas=True * use UniPCMultistepScheduler + fix tests for pipeline * Remove FlowUniPCMultistepScheduler * UniPCMultistepScheduler for use_flow_sigmas=True & use_karras_sigmas=True * num_inference_steps=36 due to bug in scheduler used by predict2.5 * Address comments * make style + make fix-copies * fix tests + remove references to old pipelines * address comments * add revision in from_pretrained call * fix tests	2025-12-19 05:38:18 +05:30
hlky	55463f7ace	Z-Image-Turbo ControlNet (#12792 ) * init --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-12-17 09:44:20 -10:00
naykun	f9c1e612fb	Qwen Image Layered Support (#12853 ) * [qwen-image] qwen image layered support * [qwen-image] update doc * [qwen-image] fix pr comments * Apply style fixes * make fix-copies --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-17 16:57:57 +05:30
Wang, Yi	87f7d11143	extend TorchAoTest::test_model_memory_usage to other platform (#12768 ) * extend TorchAoTest::test_model_memory_usage to other platform Signe-off-by: Wang, Yi <yi.a.wang@inel.com> * add some comments Signed-off-by: Wang, Yi <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi <yi.a.wang@intel.com>	2025-12-17 13:44:08 +05:30
junqiangwu	5e48f466b9	fix the prefix_token_len bug (#12845 )	2025-12-15 22:02:25 -10:00
junqiangwu	a748a839ad	Add support for LongCat-Image (#12828 ) * Add LongCat-Image * Update src/diffusers/models/transformers/transformer_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * fix code * add doc * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image_edit.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image_edit.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/longcat_image/pipeline_longcat_image.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * fix code & mask style & fix-copies * Apply style fixes * fix single input rewrite error --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: hadoop-imagen <hadoop-imagen@psxfb7pxrbvmh3oq-worker-0.psxfb7pxrbvmh3oq.hadoop-aipnlp.svc.cluster.local>	2025-12-15 07:45:17 -10:00
Yuqian Hong	58519283e7	Support for control-lora (#10686 ) * run control-lora on diffusers * cannot load lora adapter * test * 1 * add control-lora * 1 * 1 * 1 * fix PeftAdapterMixin * fix module_to_save bug * delete json print * resolve conflits * merged but bug * change peft.py * 1 * delete state_dict print * fix alpha * Create control_lora.py * Add files via upload * rename * no need modify as peft updated * add doc * fix code style * styling isn't that hard 😉 * empty --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-15 15:52:42 +05:30
Wang, Yi	0c1ccc0775	fix pytest tests/pipelines/pixart_sigma/test_pixart.py::PixArtSigmaPi… (#12842 ) fix pytest tests/pipelines/pixart_sigma/test_pixart.py::PixArtSigmaPipelineIntegrationTests::test_pixart_512 in xpu Signed-off-by: Wang, Yi <yi.a.wang@intel.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-15 14:36:01 +05:30
naykun	b8a4cbac14	[qwen-image] edit 2511 support (#12839 ) * [qwen-image] edit 2511 support * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-12-15 12:35:01 +05:30
Wang, Yi	17c0e79dbd	support CP in native flash attention (#12829 ) Signed-off-by: Wang, Yi <yi.a.wang@intel.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-12 13:18:39 +05:30
Sayak Paul	1567243463	[lora] Remove lora docs unneeded and add " # Copied from ..." (#12824 ) * remove unneeded docs on load_lora_weights(). * remove more. * up[ * up * up	2025-12-12 08:31:27 +05:30
Sayak Paul	0eac64c7a6	Update distributed_inference.md to correct syntax (#12827 )	2025-12-11 08:46:43 -08:00
Sayak Paul	10e820a2dd	post release 0.36.0 (#12804 ) * post release 0.36.0 * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-12-11 22:01:59 +05:30
Sayak Paul	6708f5c76d	[docs] improve distributed inference cp docs. (#12810 ) * improve distributed inference cp docs. * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-12-10 08:25:07 -08:00
Dhruv Nair	be3c2a0667	[WIP] Add Flux2 modular (#12763 ) * update * update * update * update * update * update * update * update * update * update	2025-12-10 12:19:07 +05:30
Sayak Paul	8b4722de57	Fix Qwen Edit Plus modular for multi-image input (#12601 ) * try to fix qwen edit plus multi images (modular) * up * up * test * up * up	2025-12-09 10:08:30 -10:00
YiYi Xu	07ea0786e8	[Modular]z-image (#12808 ) * initiL * up up * fix: z_image -> z-image * style * copy * fix more * some docstring fix	2025-12-09 08:08:41 -10:00
David El Malih	54fa0745c3	Improve docstrings and type hints in scheduling_dpmsolver_singlestep.py (#12798 ) feat: add flow sigmas, dynamic shifting, and refine type hints in DPMSolverSinglestepScheduler	2025-12-08 08:58:57 -08:00
David Lacalle Castillo	3d02cd543e	[PRX] Improve model compilation (#12787 ) * Reimplement img2seq & seq2img in PRX to enable ONNX build without Col2Im (incompatible with TensorRT). * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-08 17:42:17 +05:30
CalamitousFelicitousness	2246d2c7c4	Add ZImageImg2ImgPipeline (#12751 ) * Add ZImageImg2ImgPipeline Updated the pipeline structure to include ZImageImg2ImgPipeline alongside ZImagePipeline. Implemented the ZImageImg2ImgPipeline class for image-to-image transformations, including necessary methods for encoding prompts, preparing latents, and denoising. Enhanced the auto_pipeline to map the new ZImageImg2ImgPipeline for image generation tasks. Added unit tests for ZImageImg2ImgPipeline to ensure functionality and performance. Updated dummy objects to include ZImageImg2ImgPipeline for testing purposes. * Address review comments for ZImageImg2ImgPipeline - Add `# Copied from` annotations to encode_prompt and _encode_prompt - Add ZImagePipeline to auto_pipeline.py for AutoPipeline support * Add ZImage pipeline documentation --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>	2025-12-07 22:06:23 -10:00
YiYi Xu	671149e036	[HunyuanVideo1.5] support step-distilled (#12802 ) * support step-distilled * style	2025-12-07 21:50:36 -10:00
jiqing-feng	f67639b0bb	add post init for safty checker (#12794 ) * add post init for safty checker Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * check transformers version before post init Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Apply style fixes --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-12-08 11:31:03 +05:30
jingyu-ml	5a74319715	Update the TensorRT-ModelOPT to Nvidia-ModelOPT (#12793 ) Update the naming Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-08 10:07:04 +05:30
Tran Thanh Luan	6290fdfda4	[Feat] TaylorSeer Cache (#12648 ) * init taylor_seer cache * make compatible with any tuple size returned * use logger for printing, add warmup feature * still update in warmup steps * refractor, add docs * add configurable cache, skip compute module * allow special cache ids only * add stop_predicts (cooldown) * update docs * apply ruff * update to handle multple calls per timestep * refractor to use state manager * fix format & doc * chores: naming, remove redundancy * add docs * quality & style * fix taylor precision * Apply style fixes * add tests * Apply style fixes * Remove TaylorSeerCacheTesterMixin from flux2 tests * rename identifiers, use more expressive taylor predict loop * torch compile compatible * Apply style fixes * Update src/diffusers/hooks/taylorseer_cache.py Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * update docs * make fix-copies * fix example usage. * remove tests on flux kontext --------- Co-authored-by: toilaluan <toilaluan@github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-06 05:39:54 +05:30
David El Malih	256e010674	Improve docstrings and type hints in scheduling_deis_multistep.py (#12796 ) * feat: Add `flow_prediction` to `prediction_type`, introduce `use_flow_sigmas`, `flow_shift`, `use_dynamic_shifting`, and `time_shift_type` parameters, and refine type hints for various arguments. * style: reformat argument wrapping in `_convert_to_beta` and `index_for_timestep` method signatures.	2025-12-05 08:48:01 -08:00
Sayak Paul	8430ac2a2f	[docs] minor fixes to kandinsky docs (#12797 ) up	2025-12-05 08:33:05 -08:00
sayakpaul	bb9e713d02	move kandisnky docs.	2025-12-05 21:44:24 +07:00
Álvaro Somoza	c98c157a9e	[Docs] Add Z-Image docs (#12775 ) * initial * toctree * fix * apply review and fix * Update docs/source/en/api/pipelines/z_image.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/api/pipelines/z_image.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/api/pipelines/z_image.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-12-05 11:05:47 -03:00
swappy	f12d161d67	Fix broken group offloading with block_level for models with standalone layers (#12692 ) * fix: group offloading to support standalone computational layers in block-level offloading * test: for models with standalone and deeply nested layers in block-level offloading * feat: support for block-level offloading in group offloading config * fix: group offload block modules to AutoencoderKL and AutoencoderKLWan * fix: update group offloading tests to use AutoencoderKL and adjust input dimensions * refactor: streamline block offloading logic * Apply style fixes * update tests * update * fix for failing tests * clean up * revert to use skip_keys * clean up --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2025-12-05 18:54:05 +05:30
David Bertoin	8d415a6f48	PRX Set downscale_freq_shift to 0 for consistency with internal implementation (#12791 ) fix timestepembeddings downscale_freq_shift to be consitant with Photoroom's original code	2025-12-04 10:57:14 -10:00
Sayak Paul	7de51b826c	[lora] support more ZImage LoRAs (#12790 ) up Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>	2025-12-04 09:01:11 -10:00
Jiang	cd00ba685b	fix spatial compression ratio error for AutoEncoderKLWan doing tiled encode (#12753 ) fix spatial compression ratio compute error for AutoEncoderKLWan Co-authored-by: lirui.926 <lirui.926@bytedance.com>	2025-12-04 08:57:13 -10:00
David El Malih	2842c14c5f	Improve docstrings and type hints in scheduling_unipc_multistep.py (#12767 ) refactor: add type hints and update docstrings for UniPCMultistepScheduler parameters and methods.	2025-12-04 10:10:54 -08:00
Sayak Paul	c318686090	Update attention_backends.md to format kernels (#12757 )	2025-12-04 07:48:23 -08:00
hlky	6028613226	Z-Image-Turbo `from_single_file` (#12756 ) * Z-Image-Turbo `from_single_file` * compute_dtype * -device cast	2025-12-04 20:22:48 +05:30
Sayak Paul	a1f36ee3ef	[Z-Image] various small changes, Z-Image transformer tests, etc. (#12741 ) * start zimage model tests. * up * up * up * up * up * up * up * up * up * up * up * up * Revert "up" This reverts commit `bca3e27c96`. * expand upon compilation failure reason. * Update tests/models/transformers/test_models_transformer_z_image.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * reinitialize the padding tokens to ones to prevent NaN problems. * updates * up * skipping ZImage DiT tests * up * up --------- Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2025-12-03 19:35:46 +05:30
Sayak Paul	d96cbacacd	[tests] fix hunuyanvideo 1.5 offloading tests. (#12782 ) fix hunuyanvideo 1.5 offloading tests.	2025-12-03 18:07:59 +05:30
Aditya Borate	5ab5946931	Fix: leaf_level offloading breaks after delete_adapters (#12639 ) * Fix(peft): Re-apply group offloading after deleting adapters * Test: Add regression test for group offloading + delete_adapters * Test: Add assertions to verify output changes after deletion * Test: Add try/finally to clean up group offloading hooks --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-03 17:39:11 +05:30
Lev Novitskiy	d0c54e5563	Kandinsky 5.0 Video Pro and Image Lite (#12664 ) * add transformer pipeline first version --------- Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Charles <charles@huggingface.co> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: dmitrienkoae <dmitrienko.ae@phystech.edu> Co-authored-by: nvvaulin <nvvaulin@gmail.com>	2025-12-03 00:46:37 -10:00
Dhruv Nair	1908c47600	Deprecate `upcast_vae` in SDXL based pipelines (#12619 ) * update * update * Revert "update" This reverts commit `73906381ab`. * Revert "update" This reverts commit `21a03f93ef`. * update * update * update * update * update	2025-12-03 15:53:23 +05:30
Sayak Paul	759ea58708	[core] reuse `AttentionMixin` for compatible classes (#12463 ) * remove attn_processors property * more * up * up more. * up * add AttentionMixin to AuraFlow. * up * up * up * up	2025-12-03 13:58:33 +05:30
Sayak Paul	f48f9c250f	[core] start varlen variants for attn backend kernels. (#12765 ) * start varlen variants for attn backend kernels. * maybe unflatten heads. * updates * remove unused function. * doc * up	2025-12-03 13:34:52 +05:30
Kimbing Ng	3c05b9f71c	Fixes #12673 . `record_stream` in group offloading is not working properly (#12721 ) * Fixes #12673. Wrong default_stream is used. leading to wrong execution order when record_steram is enabled. * update * Update test --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-03 11:37:11 +05:30
Jerry Wu	9379b2391b	Fix TPU (torch_xla) compatibility Error about tensor repeat func along with empty dim. (#12770 ) * Refactor image padding logic to pervent zero tensor in transformer_z_image.py * Apply style fixes * Add more support to fix repeat bug on tpu devices. * Fix for dynamo compile error for multi if-branches. --------- Co-authored-by: Mingjia Li <mingjiali@tju.edu.cn> Co-authored-by: Mingjia Li <mail@mingjia.li> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-12-02 12:51:23 -10:00


6187 Commits