diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Author	SHA1	Message	Date
sayakpaul	91ee2dd26a	resolve conflicts	2026-01-07 10:12:20 +05:30
Hu Yaoqi	98479a94c2	LTX Video 0.9.8 long multi prompt (#12614) * LTX Video 0.9.8 long multi prompt * Further align comfyui - Added the “LTXEulerAncestralRFScheduler” scheduler, aligned with [sample_euler_ancestral_RF](`7d6103325e/comfy/k_diffusion/sampling.py (L234)`) - Updated the LTXI2VLongMultiPromptPipeline.from_pretrained() method: - Now uses LTXEulerAncestralRFScheduler by default, for better compatibility with the ComfyUI LTXV workflow. - Changed the default value of cond_strength from 1.0 to 0.5, aligning with ComfyUI’s default. - Optimized cross-window overlap blending: moved the latent-space guidance injection to before the UNet and after each step, aligned with[KSamplerX0Inpaint]([ComfyUI/comfy/samplers.py at master · comfyanonymous/ComfyUI](https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/samplers.py#L391)) - Adjusted the default value of skip_steps_sigma_threshold to 1. * align with diffusers contribute rule * Add new pipelines and update imports * Enhance LTXI2VLongMultiPromptPipeline with noise rescaling Refactor LTXI2VLongMultiPromptPipeline to improve documentation and add noise rescaling functionality. * Clean up comments in scheduling_ltx_euler_ancestral_rf.py Removed design notes and limitations from the implementation. * Enhance video generation example with scheduler Updated LTXI2VLongMultiPromptPipeline example to include LTXEulerAncestralRFScheduler for ComfyUI parity. * clean up * style * copies * import ltx scheduler * copies * fix * fix more * up up * up up up * up upup * Apply suggestions from code review * Update docs/source/en/api/pipelines/ltx_video.md * Update docs/source/en/api/pipelines/ltx_video.md --------- Co-authored-by: yiyixuxu <yixu310@gmail.com>	2026-01-06 18:18:04 -10:00
Sayak Paul	cc28cf76a7	Merge branch 'main' into ltx-2-transformer	2026-01-07 09:43:08 +05:30
Daniel Gu	d01a242cdb	make style and make quality	2026-01-06 23:54:23 +01:00
Daniel Gu	5e0cf2b2f0	Simplify LTX 2 RoPE forward by removing coords is None logic	2026-01-06 23:32:59 +01:00
zhangtao0408	ade1059ae2	[Flux.1] improve pos embed for ascend npu by computing on npu (#12897) * [Flux.1] improve pos embed for ascend npu by setting it back to npu computation. * [Flux.2] improve pos embed for ascend npu by setting it back to npu computation. * [LongCat-Image] improve pos embed for ascend npu by setting it back to npu computation. * [Ovis-Image] improve pos embed for ascend npu by setting it back to npu computation. * Remove unused import of is_torch_npu_available --------- Co-authored-by: zhangtao <zhangtao529@huawei.com>	2026-01-06 08:48:04 -10:00
dxqb	41a6e86faf	Check for attention mask in backends that don't support it (#12892) * check attention mask * Apply style fixes * bugfix --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2026-01-06 22:52:12 +05:30
Pauline Bailly-Masson	9b5a244653	CodeQL workflow for security analysis	2026-01-06 17:26:08 +01:00
Pauline Bailly-Masson	417f6b2d33	Delete .github/workflows/codeql.yml	2026-01-06 17:25:38 +01:00
Pauline Bailly-Masson	e46354d2d0	Add codeQL workflow (#12917) Updated CodeQL workflow to use reusable workflow from Hugging Face and simplified language matrix.	2026-01-06 17:19:48 +01:00
Sayak Paul	64b48c1729	Merge branch 'main' into ltx-2-transformer	2026-01-06 21:31:46 +05:30
sayakpaul	8c5ab1fd6d	disable ltx2_consistency test	2026-01-06 21:31:29 +05:30
sayakpaul	61e0fb4bd8	update doc entries.	2026-01-06 21:15:47 +05:30
sayakpaul	bdcf23ec17	update docs.	2026-01-06 21:02:18 +05:30
sayakpaul	c39f1b87a4	remove args.	2026-01-06 20:52:49 +05:30
sayakpaul	57ead0b5e5	remove function map.	2026-01-06 20:48:16 +05:30
Pauline Bailly-Masson	db37140474	Refactor environment variable assignments in workflow (#12916)	2026-01-06 13:39:18 +01:00
Sayak Paul	2fc578941b	Merge branch 'main' into ltx-2-transformer	2026-01-06 13:51:36 +05:30
hlky	88ffb00139	Detect 2.0 vs 2.1 ZImageControlNetModel (#12861) * Detect 2.0 vs 2.1 ZImageControlNetModel * Possibility of control_noise_refiner being removed	2026-01-05 20:28:52 -10:00
Sayak Paul	b6098ca006	[core] remove unneeded autoencoder methods when subclassing from `AutoencoderMixin` (#12873) up	2026-01-05 19:43:54 -10:00
Sayak Paul	7c6d314549	fix the use of device_map in CP docs (#12902) up	2026-01-05 19:42:32 -10:00
Daniel Gu	dd81242eba	make style and make quality	2026-01-06 06:42:24 +01:00
Daniel Gu	ace2ee93fb	Allow the I2V pipeline to accept image URLs	2026-01-06 06:40:42 +01:00
Daniel Gu	ef199118e2	Point original checkpoint to LTX 2.0 official checkpoint	2026-01-06 06:35:51 +01:00
sayakpaul	550eca3530	use export util funcs.	2026-01-06 09:14:38 +05:30
sayakpaul	c039c87b99	up	2026-01-06 08:09:59 +05:30
sayakpaul	9b8788cc98	resolve conflicts.	2026-01-06 08:09:37 +05:30
Sayak Paul	93a417f24a	Tests for T2V and I2V (#6 ) * add ltx2 pipeline tests. * up * up * up * up * remove content * style * Denormalize audio latents in I2V pipeline (analogous to T2V change) * Initial refactor to put video and audio text encoder connectors in transformer * Get LTX 2 transformer tests working after connector refactor * up * up * i2v tests. * up * Address review comments * Calculate RoPE double precisions freqs using torch instead of np * Further simplify LTX 2 RoPE freq calc * revert unneded changes. * up * up * update to split style rope. * up --------- Co-authored-by: Daniel Gu <dgu8957@gmail.com>	2026-01-06 08:05:30 +05:30
dg845	ce9da5d472	Merge pull request #20 from huggingface/video-export-utils-file Add export_utils file for exporting LTX 2.0 videos with audio	2026-01-05 18:25:29 -08:00
DefTruth	3138e37fe6	Fix wan 2.1 i2v context parallel (#12909) * fix wan 2.1 i2v context parallel * fix wan 2.1 i2v context parallel * fix wan 2.1 i2v context parallel * format	2026-01-06 07:42:53 +05:30
Miguel Martin	0da1aa90b5	Fix typo in src/diffusers/pipelines/cosmos/pipeline_cosmos2_5_predict.py (#12914)	2026-01-05 15:44:39 -10:00
Daniel Gu	cb50cacba5	Add export_utils file for exporting LTX 2.0 videos with audio	2026-01-06 02:17:39 +01:00
Daniel Gu	bff989110c	Fix apply split RoPE shape error when reshaping x to 4D	2026-01-06 01:22:05 +01:00
Daniel Gu	2fa4f8471f	When using split RoPE, make sure that the output dtype is same as input dtype	2026-01-06 00:19:39 +01:00
Sayak Paul	c5b52d6c9f	address initial feedback from lightricks team (#16) * cross_attn_timestep_scale_multiplier to 1000 * implement split rope type. * up * propagate rope_type to rope embed classes as well. * up	2026-01-05 21:13:10 +05:30
Sayak Paul	0be4f31620	up (#19)	2026-01-05 21:13:01 +05:30
dg845	caae16768a	Move Video and Audio Text Encoder Connectors to Transformer (#12 ) * Denormalize audio latents in I2V pipeline (analogous to T2V change) * Initial refactor to put video and audio text encoder connectors in transformer * Get LTX 2 transformer tests working after connector refactor * precompute run_connectors,. * fixes * Address review comments * Calculate RoPE double precisions freqs using torch instead of np * Further simplify LTX 2 RoPE freq calc * Make connectors a separate module (#18) * remove text_encoder.py * address yiyi's comments. * up * up * up * up --------- Co-authored-by: sayakpaul <spsayakpaul@gmail.com>	2026-01-05 20:11:13 +05:30
Jefri Haryono	5ffb65803d	Community Pipeline: Add z-image differential img2img (#12882) * Community Pipeline: Add z-image differential img2img * add pipeline for z-image differential img2img diffusion examples : run make style , make quality, and fix white spaces in example doc string. --------- Co-authored-by: r4inm4ker <jefri.yeh@gmail.com>	2026-01-05 09:53:52 -03:00
DefTruth	d0ae34d313	chore: fix dev version in setup.py (#12904)	2026-01-05 09:21:48 +05:30
hlky	47378066c0	Z-Image-Turbo from_single_file fix (#12888)	2026-01-02 22:29:24 +05:30
Maxim Balabanski	208cda8f6d	fix Qwen Image Transformer single file loading mapping function to be consistent with other loader APIs (#12894) fix Qwen single file loading to be consistent with other loader API	2026-01-02 12:59:11 +05:30
dg845	aae70b90db	Merge pull request #10 from huggingface/make-scheduler-consistent Make LTX 2.0 Scheduler `sigmas` Consistent with Original Code	2025-12-31 13:46:47 -08:00
sayakpaul	d3f10fe54e	test i2v.	2025-12-31 09:36:48 +05:30
dg845	bd607b97a8	Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11)	2025-12-31 09:23:35 +05:30
Daniel Gu	6a236a27fb	Merge branch 'ltx-2-transformer' into make-scheduler-consistent	2025-12-30 20:25:59 +01:00
Vasiliy Kuznetsov	1cdb8723b8	fix torchao quantizer for new torchao versions (#12901 ) * fix torchao quantizer for new torchao versions Summary: `torchao==0.16.0` (not yet released) has some bc-breaking changes, this PR fixes the diffusers repo with those changes. Specifics on the changes: 1. `UInt4Tensor` is removed: https://github.com/pytorch/ao/pull/3536 2. old float8 tensors v1 are removed: https://github.com/pytorch/ao/pull/3510 In this PR: 1. move the logger variable up (not sure why it was in the middle of the file before) to get better error messages 2. gate the old torchao objects by torchao version Test Plan: import diffusers objects with new versions of torchao works: ```bash > python -c "import torchao; print(torchao.__version__); from diffusers import StableDiffusionPipeline" 0.16.0.dev20251229+cu129 ``` Reviewers: Subscribers: Tasks: Tags: * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-12-30 10:04:54 +05:30
Sayak Paul	46822c43db	Add support for I2V (#8) * start i2v. * up * up * up * up * up * remove uniform strategy code. * remove unneeded code.	2025-12-30 09:06:07 +05:30
Sayak Paul	280e347814	Refactor Audio VAE to be simpler and remove helpers (#7 ) * remove resolve causality axes stuff. * remove a bunch of helpers. * remove adjust output shape helper. * remove the use of audiolatentshape. * move normalization and patchify out of pipeline. * fix * up * up * Remove unpatchify and patchify ops before audio latents denormalization (#9) --------- Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2025-12-30 08:05:56 +05:30
Daniel Gu	e1f0b7e255	Fix typo when applying scheduler fix in T2V inference script	2025-12-30 00:38:51 +01:00
Daniel Gu	581f21c431	Make LTX 2.0 scheduler more consistent with original code	2025-12-29 23:44:52 +01:00


6187 Commits