diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Author	SHA1	Message	Date
Daniel Gu	cbb10b8dca	Support num_videos_per_prompt for prompt embeddings	2025-12-23 07:01:17 +01:00
Daniel Gu	6e6ce20595	Duplicate scheduler for audio latents	2025-12-23 06:40:35 +01:00
Daniel Gu	54bfc5d617	Add Audio VAE logic to T2V pipeline	2025-12-23 03:51:22 +01:00
Daniel Gu	ae3b6e7cc2	Merge branch 'ltx-2-transformer' into ltx-2-t2v-pipeline	2025-12-23 02:59:33 +01:00
Daniel Gu	d303e2a6ff	Conversion script for LTX 2.0 Audio VAE Decoder	2025-12-23 02:48:15 +01:00
Daniel Gu	5f7e43d17f	Add imports for LTX 2.0 Audio VAE	2025-12-23 02:08:51 +01:00
dg845	7bb4cf76ce	Merge pull request #5 from huggingface/audio-decoder Audio decoder	2025-12-22 17:00:11 -08:00
sayakpaul	409d651bab	resolve conflicts.	2025-12-22 15:59:31 +05:30
sayakpaul	8134da6a56	up	2025-12-22 15:55:29 +05:30
Sayak Paul	059999a3f7	up	2025-12-22 10:24:55 +00:00
sayakpaul	58257eb0e0	up	2025-12-22 15:45:56 +05:30
Sayak Paul	5f0f2a03f7	up	2025-12-22 10:06:39 +00:00
Daniel Gu	d0f9cdaab1	Rough initial LTX 2.0 pipeline implementation	2025-12-22 10:07:20 +01:00
Daniel Gu	0028955c37	Initial LTX 2.0 text encoder implementation	2025-12-22 10:06:01 +01:00
sayakpaul	4904fd6fa5	up	2025-12-22 13:46:58 +05:30
sayakpaul	907896d533	simplify and clean up	2025-12-22 13:41:41 +05:30
sayakpaul	e54cd6bb1d	up	2025-12-22 13:03:40 +05:30
sayakpaul	f4c2435d61	init registration.	2025-12-22 12:25:36 +05:30
sayakpaul	b34ddb1736	start audio decoder.	2025-12-22 12:23:31 +05:30
Daniel Gu	6c56954fa8	Use RMSNorm implementation closer to original for LTX 2.0 video VAE	2025-12-20 02:40:38 +01:00
dg845	b1cf6ff8a9	Merge pull request #2 from huggingface/ltx-2-video-vae LTX 2.0 Video VAE Implementation	2025-12-19 16:36:38 -08:00
dg845	8bfeb4af56	Merge pull request #3 from huggingface/ltx-2-vocoder LTX 2.0 Vocoder Implementation	2025-12-19 16:21:31 -08:00
Daniel Gu	c6a11a5530	Initial LTX 2.0 vocoder implementation	2025-12-19 12:17:10 +01:00
Daniel Gu	a748975a7c	Get diffusers implementation on par with official LTX 2.0 video VAE implementation	2025-12-19 07:02:38 +01:00
Daniel Gu	491aae08d8	Add initial LTX 2.0 video VAE tests (part 2)	2025-12-17 11:39:09 +01:00
Daniel Gu	5b950d6fef	Add initial LTX 2.0 video VAE tests	2025-12-17 11:30:15 +01:00
Daniel Gu	baf23e2da3	Explicitly specify temporal and spatial VAE scale factors when converting	2025-12-17 11:14:45 +01:00
Daniel Gu	269cf7b40d	Initial implementation of LTX 2.0 video VAE	2025-12-17 10:51:34 +01:00
Daniel Gu	bda3ff13db	Fix LTX 2 transformer bugs so consistency test passes	2025-12-16 10:53:43 +01:00
Daniel Gu	a7bc052e89	Improve dummy inputs and add test for LTX 2 transformer consistency	2025-12-16 10:44:02 +01:00
Daniel Gu	57a8b9c330	Allow LTX 2 transformer to be loaded from local path for conversion	2025-12-16 10:38:03 +01:00
Daniel Gu	d86f89ddea	Add more LTX 2 transformer audio arguments	2025-12-16 07:58:12 +01:00
Daniel Gu	a5f2d2da6c	Initial script to convert LTX 2 transformer to diffusers	2025-12-15 07:09:42 +01:00
Daniel Gu	aeecc4d712	Fix LTX 2 transformer shape errors	2025-12-15 06:38:57 +01:00
Daniel Gu	5765759cd3	Get LTX 2 transformer compile tests passing	2025-12-15 03:38:34 +01:00
Daniel Gu	780fb61d32	Remove RoPE debug print statements	2025-12-13 10:37:24 +01:00
Daniel Gu	e100b8f2a3	Rename LTX 2 compile test class to have LTX2	2025-12-13 10:34:11 +01:00
Daniel Gu	980591de53	Get LTX 2 transformer tests working	2025-12-13 04:57:23 +01:00
Daniel Gu	b3096c3c9e	Add tests for LTX 2 transformer model	2025-12-13 04:55:41 +01:00
Daniel Gu	aa602ac483	Initial LTX 2.0 transformer implementation	2025-12-12 07:52:33 +01:00
Sayak Paul	8b4722de57	Fix Qwen Edit Plus modular for multi-image input (#12601) * try to fix qwen edit plus multi images (modular) * up * up * test * up * up	2025-12-09 10:08:30 -10:00
YiYi Xu	07ea0786e8	[Modular]z-image (#12808) * initiL * up up * fix: z_image -> z-image * style * copy * fix more * some docstring fix	2025-12-09 08:08:41 -10:00
David El Malih	54fa0745c3	Improve docstrings and type hints in scheduling_dpmsolver_singlestep.py (#12798) feat: add flow sigmas, dynamic shifting, and refine type hints in DPMSolverSinglestepScheduler	2025-12-08 08:58:57 -08:00
David Lacalle Castillo	3d02cd543e	[PRX] Improve model compilation (#12787) * Reimplement img2seq & seq2img in PRX to enable ONNX build without Col2Im (incompatible with TensorRT). * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-08 17:42:17 +05:30
CalamitousFelicitousness	2246d2c7c4	Add ZImageImg2ImgPipeline (#12751) * Add ZImageImg2ImgPipeline Updated the pipeline structure to include ZImageImg2ImgPipeline alongside ZImagePipeline. Implemented the ZImageImg2ImgPipeline class for image-to-image transformations, including necessary methods for encoding prompts, preparing latents, and denoising. Enhanced the auto_pipeline to map the new ZImageImg2ImgPipeline for image generation tasks. Added unit tests for ZImageImg2ImgPipeline to ensure functionality and performance. Updated dummy objects to include ZImageImg2ImgPipeline for testing purposes. * Address review comments for ZImageImg2ImgPipeline - Add `# Copied from` annotations to encode_prompt and _encode_prompt - Add ZImagePipeline to auto_pipeline.py for AutoPipeline support * Add ZImage pipeline documentation --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>	2025-12-07 22:06:23 -10:00
YiYi Xu	671149e036	[HunyuanVideo1.5] support step-distilled (#12802) * support step-distilled * style	2025-12-07 21:50:36 -10:00
jiqing-feng	f67639b0bb	add post init for safty checker (#12794) * add post init for safty checker Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * check transformers version before post init Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Apply style fixes --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-12-08 11:31:03 +05:30
jingyu-ml	5a74319715	Update the TensorRT-ModelOPT to Nvidia-ModelOPT (#12793) Update the naming Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-08 10:07:04 +05:30
Tran Thanh Luan	6290fdfda4	[Feat] TaylorSeer Cache (#12648) * init taylor_seer cache * make compatible with any tuple size returned * use logger for printing, add warmup feature * still update in warmup steps * refractor, add docs * add configurable cache, skip compute module * allow special cache ids only * add stop_predicts (cooldown) * update docs * apply ruff * update to handle multple calls per timestep * refractor to use state manager * fix format & doc * chores: naming, remove redundancy * add docs * quality & style * fix taylor precision * Apply style fixes * add tests * Apply style fixes * Remove TaylorSeerCacheTesterMixin from flux2 tests * rename identifiers, use more expressive taylor predict loop * torch compile compatible * Apply style fixes * Update src/diffusers/hooks/taylorseer_cache.py Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * update docs * make fix-copies * fix example usage. * remove tests on flux kontext --------- Co-authored-by: toilaluan <toilaluan@github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-12-06 05:39:54 +05:30
David El Malih	256e010674	Improve docstrings and type hints in scheduling_deis_multistep.py (#12796) * feat: Add `flow_prediction` to `prediction_type`, introduce `use_flow_sigmas`, `flow_shift`, `use_dynamic_shifting`, and `time_shift_type` parameters, and refine type hints for various arguments. * style: reformat argument wrapping in `_convert_to_beta` and `index_for_timestep` method signatures.	2025-12-05 08:48:01 -08:00

6106 Commits