mirror of https://github.com/huggingface/diffusers.git synced 2026-01-29 07:22:12 +03:00
Commit Graph

12 Commits

Author SHA1 Message Date
Dhruv Nair
7aa6af1138 [Refactor] Move testing utils out of src (#12238)
* update

* update

* update

* update

* update

* merge main

* Revert "merge main"

This reverts commit 65efbcead5.
2025-08-28 19:53:02 +05:30
Aryan
0454fbb30b First Block Cache (#11180)
* update

* modify flux single blocks to make them compatible with cache techniques (without too much model-specific intrusive code)

* remove debug logs

* update

* cache context for different batches of data

* fix hs residual bug for single return outputs; support ltx

* fix controlnet flux

* support flux, ltx i2v, ltx condition

* update

* update

* Update docs/source/en/api/cache.md

* Update src/diffusers/hooks/hooks.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* address review comments pt. 1

* address review comments pt. 2

* cache context refactor; address review pt. 3

* address review comments

* metadata registration with decorators instead of centralized

* support cogvideox

* support mochi

* fix

* remove unused function

* remove central registry based on review

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-07-09 03:27:15 +05:30
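The user-facing API from this PR is `CacheMixin.enable_cache` with a cache config; a minimal sketch of enabling first block caching on Flux, assuming the `FirstBlockCacheConfig` name, its top-level import path, and its `threshold` field from the diffusers cache docs:

```python
import torch
from diffusers import FluxPipeline, FirstBlockCacheConfig  # import path assumed

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Skip the remaining transformer blocks whenever the first block's output
# changes by less than the threshold relative to the previous timestep.
pipe.transformer.enable_cache(FirstBlockCacheConfig(threshold=0.2))

image = pipe("A photo of a cat", num_inference_steps=28).images[0]
```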
Dhruv Nair
cbc8ced20f [CI] Fix big GPU test marker (#11786)
* update

* update
2025-07-08 22:09:09 +05:30
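The marker being fixed gates tests that need a high-memory GPU; a hedged sketch of how such tests are declared, assuming the `require_big_gpu_with_torch_cuda` decorator and `big_gpu_with_torch_cuda` pytest marker used elsewhere in the suite (this PR's fix may have adjusted the exact names):

```python
import pytest

# Assumed names; see diffusers.utils.testing_utils for the canonical ones.
from diffusers.utils.testing_utils import require_big_gpu_with_torch_cuda

@require_big_gpu_with_torch_cuda
@pytest.mark.big_gpu_with_torch_cuda
class ExampleBigGPUTests:
    def test_inference(self):
        ...  # memory-hungry pipeline test body
```

On CI these are then selected with something like `pytest -m big_gpu_with_torch_cuda tests/`.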
Aryan
a4df8dbc40 Update more licenses to 2025 (#11746)
update
2025-06-19 07:46:01 +05:30
Yao Matrix
2d380895e5 enable 7 cases on XPU (#11503)
* enable 7 cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* calibrate A100 expectations

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-05-09 15:52:08 +05:30
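Enabling a case on XPU typically means removing hard-coded CUDA calls and keying expected outputs per backend, which is what the "calibrate A100 expectations" step refers to. A sketch under those assumptions; the slice values below are placeholders, not numbers from the PR:

```python
from diffusers.utils.testing_utils import backend_empty_cache, torch_device

# Hypothetical per-backend expectations; real values come from calibration
# runs on each accelerator (CUDA on A100, XPU, ...).
EXPECTED_SLICES = {
    "cuda": [0.3633, 0.3066, 0.3398],
    "xpu": [0.3645, 0.3071, 0.3402],
}

def check_output_slice(output_slice, atol=1e-3):
    expected = EXPECTED_SLICES[torch_device]
    assert all(abs(a - b) < atol for a, b in zip(output_slice, expected))
    backend_empty_cache(torch_device)  # free accelerator memory between cases
```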
Aryan
844221ae4e [core] FasterCache (#10163)
* init

* update

* update

* update

* make style

* update

* fix

* make it work with guidance distilled models

* update

* make fix-copies

* add tests

* update

* apply_faster_cache -> apply_fastercache

* fix

* reorder

* update

* refactor

* update docs

* add fastercache to CacheMixin

* update tests

* Apply suggestions from code review

* make style

* try to fix partial import error

* Apply style fixes

* raise warning

* update

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-21 09:35:04 +05:30
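Per the `add fastercache to CacheMixin` step, FasterCache is applied through `enable_cache` like the other cache techniques; a minimal sketch, assuming the `FasterCacheConfig` fields from the diffusers docs:

```python
import torch
from diffusers import CogVideoXPipeline, FasterCacheConfig

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipe.to("cuda")

config = FasterCacheConfig(
    spatial_attention_block_skip_range=2,
    # FasterCache decides per denoising step whether attention outputs can
    # be reused, so it needs a callback for the pipeline's current timestep.
    current_timestep_callback=lambda: pipe.current_timestep,
)
pipe.transformer.enable_cache(config)

video = pipe("A panda playing guitar", num_frames=49).frames[0]
```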
Fanli Lin
7855ac597e [tests] make tests device-agnostic (part 4) (#10508)
* initial commit

* fix empty cache

* fix one more

* fix style

* update device functions

* update

* update

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/controlnet/test_controlnet.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/controlnet/test_controlnet.py

Co-authored-by: hlky <hlky@hlky.ac>

* with gc.collect

* update

* make style

* check_torch_dependencies

* add mps empty cache

* add changes

* bug fix

* enable on xpu

* update more cases

* revert

* revert back

* Update test_stable_diffusion_xl.py

* Update tests/pipelines/stable_diffusion/test_stable_diffusion.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py

Co-authored-by: hlky <hlky@hlky.ac>

* Apply suggestions from code review

Co-authored-by: hlky <hlky@hlky.ac>

* add test marker

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-03-04 08:26:06 +00:00
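The pattern this series converges on replaces direct `torch.cuda.*` calls with backend-dispatching helpers; a sketch of the resulting device-agnostic setup/teardown (note the `with gc.collect` and `add mps empty cache` steps above), assuming the `backend_empty_cache` helper from `testing_utils`:

```python
import gc
import unittest

from diffusers.utils.testing_utils import backend_empty_cache, torch_device

class ExamplePipelineSlowTests(unittest.TestCase):
    def setUp(self):
        gc.collect()
        backend_empty_cache(torch_device)  # dispatches to cuda/xpu/mps as needed

    def tearDown(self):
        gc.collect()
        backend_empty_cache(torch_device)
```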
Aryan
9a147b82f7 Module Group Offloading (#10503)
* update

* fix

* non_blocking; handle parameters and buffers

* update

* Group offloading with cuda stream prefetching (#10516)

* cuda stream prefetch

* remove breakpoints

* update

* copy model hook implementation from pab

* update; very workaround-based implementation, but it seems to work as expected; needs cleanup and rewrite

* more workarounds to make it actually work

* cleanup

* rewrite

* update

* make sure to sync current stream before overwriting with pinned params

not doing so will lead to erroneous computations on the GPU and cause bad results

* better check

* update

* remove hook implementation to not deal with merge conflict

* re-add hook changes

* why use more memory when less memory do trick

* why still use slightly more memory when less memory do trick

* optimise

* add model tests

* add pipeline tests

* update docs

* add layernorm and groupnorm

* address review comments

* improve tests; add docs

* improve docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions from code review

* update tests

* apply suggestions from review

* enable_group_offloading -> enable_group_offload for naming consistency

* raise errors if multiple offloading strategies used; add relevant tests

* handle .to() when group offload applied

* refactor some repeated code

* remove unintentional change from merge conflict

* handle .cuda()

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-14 12:59:45 +05:30
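The entry point this PR settles on is `enable_group_offload` on models (see the rename step above); a minimal sketch, assuming the keyword names from the diffusers offloading docs:

```python
import torch
from diffusers import CogVideoXPipeline

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)

# Move groups of layers between CPU and GPU during the forward pass;
# use_stream=True enables the CUDA-stream prefetching added in #10516.
pipe.transformer.enable_group_offload(
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="leaf_level",
    use_stream=True,
)
```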
Aryan
beacaa5528 [core] Layerwise Upcasting (#10347)
* update

* update

* make style

* remove dynamo disable

* add coauthor

Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* update

* update

* update

* update mixin

* add some basic tests

* update

* update

* non_blocking

* improvements

* update

* norm.* -> norm

* apply suggestions from review

* add example

* update hook implementation to the latest changes from pyramid attention broadcast

* deinitialize should raise an error

* update doc page

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* update

* refactor

* fix _always_upcast_modules for asym ae and vq_model

* fix lumina embedding forward to not depend on weight dtype

* refactor tests

* add simple lora inference tests

* _always_upcast_modules -> _precision_sensitive_module_patterns

* remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case

* check layer dtypes in lora test

* fix UNet1DModelTests::test_layerwise_upcasting_inference

* _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback

* skip test in NCSNppModelTests

* skip tests for AutoencoderTinyTests

* skip tests for AutoencoderOobleckTests

* skip tests for UNet1DModelTests - unsupported pytorch operations

* layerwise_upcasting -> layerwise_casting

* skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support

* add layerwise fp8 pipeline test

* use xfail

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' tests to pass)

* add note about memory consumption on tesla CI runner for failing test

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-22 19:49:37 +05:30
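After the `layerwise_upcasting -> layerwise_casting` rename, the feature is exposed as `enable_layerwise_casting` on models; a minimal sketch following the diffusers docs:

```python
import torch
from diffusers import CogVideoXTransformer3DModel

transformer = CogVideoXTransformer3DModel.from_pretrained(
    "THUDM/CogVideoX-5b", subfolder="transformer", torch_dtype=torch.bfloat16
)

# Weights are stored in fp8 and upcast per layer to bf16 during forward;
# modules matching _skip_layerwise_casting_patterns (e.g. norms) are left alone.
transformer.enable_layerwise_casting(
    storage_dtype=torch.float8_e4m3fn,
    compute_dtype=torch.bfloat16,
)
```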
Sayak Paul
a6f043a80f [LoRA] allow big CUDA tests to run properly for LoRA (and others) (#9845)
* allow big lora tests to run on the CI.

* print

* print.

* print

* print

* print

* print

* more

* print

* remove print.

* remove print

* directly place on cuda.

* remove pipeline.

* remove

* fix

* fix

* spaces

* quality

* updates

* directly place flux controlnet pipeline on cuda.

* torch_device instead of cuda.

* style

* device placement.

* fixes

* add big gpu marker for mochi; rename test correctly

* address feedback

* fix

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2025-01-10 12:50:24 +05:30
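The recurring change here is the `torch_device instead of cuda` step: test device placement goes through the suite's `torch_device` so the same test runs on any accelerator. A minimal sketch with a stand-in module:

```python
import torch
from diffusers.utils.testing_utils import torch_device

model = torch.nn.Linear(4, 4)

# Before: model.to("cuda") fails on non-CUDA runners.
# After: torch_device resolves to the available backend (cuda, xpu, mps, cpu).
model.to(torch_device)
```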
Aryan
f66bd3261c Rename Mochi integration test correctly (#10220)
rename integration test
2024-12-18 22:41:23 +05:30
Aryan
3f329a426a [core] Mochi T2V (#9769)
* update

* update

* update transformer

* make style

* fix

* add conversion script

* update

* fix

* update

* fix

* update

* fixes

* make style

* update

* update

* update

* init

* update

* update

* add

* up

* up

* up

* update

* mochi transformer

* remove original implementation

* make style

* update inits

* update conversion script

* docs

* Update src/diffusers/pipelines/mochi/pipeline_mochi.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update src/diffusers/pipelines/mochi/pipeline_mochi.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fix docs

* pipeline fixes

* make style

* invert sigmas in scheduler; fix pipeline

* fix pipeline num_frames

* flip proj and gate in swiglu

* make style

* fix

* make style

* fix tests

* latent mean and std fix

* update

* cherry-pick 1069d210e1

* remove additional sigma already handled by flow match scheduler

* fix

* remove hardcoded value

* replace conv1x1 with linear

* Update src/diffusers/pipelines/mochi/pipeline_mochi.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* framewise decoding and conv_cache

* make style

* Apply suggestions from code review

* mochi vae encoder changes

* rebase correctly

* Update scripts/convert_mochi_to_diffusers.py

* fix tests

* fixes

* make style

* update

* make style

* update

* add framewise and tiled encoding

* make style

* make original vae implementation behaviour the default; note: framewise encoding does not work

* remove framewise encoding implementation due to presence of attn layers

* fight test 1

* fight test 2

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2024-11-05 20:33:41 +05:30
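The pipeline added by this PR is used roughly as below; a sketch following the diffusers Mochi docs, with the framewise/tiled VAE decoding and conv_cache work above handled internally:

```python
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()  # tiled decoding keeps VAE memory bounded

frames = pipe(
    "A close-up of a chameleon walking along a branch",
    num_frames=85,  # Mochi targets 30 fps output
    num_inference_steps=64,
).frames[0]
export_to_video(frames, "mochi.mp4", fps=30)
```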