diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Author	SHA1	Message	Date
Sayak Paul	43459079ab	[core] feat: support group offloading at the pipeline level (#12283) * feat: support group offloading at the pipeline level. * add tests * up * [docs] Pipeline group offloading (#12286) init Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-09-10 09:09:57 +05:30
Steven Liu	6184d8a433	[docs] device_map (#11711) draft Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-06-20 10:14:48 -07:00
Steven Liu	5a6e386464	[docs] Quantization + torch.compile + offloading (#11703) * draft * feedback * update * feedback * fix * feedback * feedback * fix * feedback	2025-06-20 10:11:39 -07:00
Sayak Paul	85a916bb8b	make group offloading work with disk/nvme transfers (#11682) * start implementing disk offloading in group. * delete diff file. * updates.patch * offload_to_disk_path * check if safetensors already exist. * add test and clarify. * updates * update todos. * update more docs. * update docs	2025-06-19 18:09:30 +05:30
Aryan	a4df8dbc40	Update more licenses to 2025 (#11746) update	2025-06-19 07:46:01 +05:30
Sayak Paul	6918f6d19a	[docs] tip for group offloding + quantization (#11576) * tip for group offloding + quantization Co-authored-by: Aryan VS <contact.aryanvs@gmail.com> * Apply suggestions from code review Co-authored-by: Aryan <aryan@huggingface.co> --------- Co-authored-by: Aryan VS <contact.aryanvs@gmail.com> Co-authored-by: Aryan <aryan@huggingface.co>	2025-05-19 14:49:15 +05:30
Steven Liu	b848d479b1	[docs] Memory optims (#11385) * reformat * initial * fin * review * inference * feedback * feedback * feedback	2025-05-01 11:22:00 -07:00
Sayak Paul	4b27c4a494	[feat] implement `record_stream` when using CUDA streams during group offloading (#11081) * implement record_stream for better performance. * fix * style. * merge #11097 * Update src/diffusers/hooks/group_offloading.py Co-authored-by: Aryan <aryan@huggingface.co> * fixes * docstring. * remaining todos in low_cpu_mem_usage * tests * updates to docs. --------- Co-authored-by: Aryan <aryan@huggingface.co>	2025-04-08 21:17:49 +05:30
Aryan	1ddf3f3a19	Improve information about group offloading and layerwise casting (#11101) * update * Update docs/source/en/optimization/memory.md * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * apply review suggestions * update --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2025-03-24 23:25:59 +05:30
Aryan	9a147b82f7	Module Group Offloading (#10503) * update * fix * non_blocking; handle parameters and buffers * update * Group offloading with cuda stream prefetching (#10516) * cuda stream prefetch * remove breakpoints * update * copy model hook implementation from pab * update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite * more workarounds to make it actually work * cleanup * rewrite * update * make sure to sync current stream before overwriting with pinned params not doing so will lead to erroneous computations on the GPU and cause bad results * better check * update * remove hook implementation to not deal with merge conflict * re-add hook changes * why use more memory when less memory do trick * why still use slightly more memory when less memory do trick * optimise * add model tests * add pipeline tests * update docs * add layernorm and groupnorm * address review comments * improve tests; add docs * improve docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * apply suggestions from code review * update tests * apply suggestions from review * enable_group_offloading -> enable_group_offload for naming consistency * raise errors if multiple offloading strategies used; add relevant tests * handle .to() when group offload applied * refactor some repeated code * remove unintentional change from merge conflict * handle .cuda() --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-02-14 12:59:45 +05:30
Aryan	beacaa5528	[core] Layerwise Upcasting (#10347) * update * update * make style * remove dynamo disable * add coauthor Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com> * update * update * update * update mixin * add some basic tests * update * update * non_blocking * improvements * update * norm.* -> norm * apply suggestions from review * add example * update hook implementation to the latest changes from pyramid attention broadcast * deinitialize should raise an error * update doc page * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update docs * update * refactor * fix _always_upcast_modules for asym ae and vq_model * fix lumina embedding forward to not depend on weight dtype * refactor tests * add simple lora inference tests * _always_upcast_modules -> _precision_sensitive_module_patterns * remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case * check layer dtypes in lora test * fix UNet1DModelTests::test_layerwise_upcasting_inference * _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback * skip test in NCSNppModelTests * skip tests for AutoencoderTinyTests * skip tests for AutoencoderOobleckTests * skip tests for UNet1DModelTests - unsupported pytorch operations * layerwise_upcasting -> layerwise_casting * skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support * add layerwise fp8 pipeline test * use xfail * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass) * add note about memory consumption on tesla CI runner for failing test --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-01-22 19:49:37 +05:30
suzukimain	b52119ae92	[docs] Replace runwayml/stable-diffusion-v1-5 with Lykon/dreamshaper-8 (#9428) * [docs] Replace runwayml/stable-diffusion-v1-5 with Lykon/dreamshaper-8 Updated documentation as runwayml/stable-diffusion-v1-5 has been removed from Huggingface. * Update docs/source/en/using-diffusers/inpaint.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Replace with stable-diffusion-v1-5/stable-diffusion-v1-5 * Update inpaint.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-09-16 10:18:45 -07:00
Mark Van Aken	be4afa0bb4	#7535 Update FloatTensor type hints to Tensor (#7883) * find & replace all FloatTensors to Tensor * apply formatting * Update torch.FloatTensor to torch.Tensor in the remaining files * formatting * Fix the rest of the places where FloatTensor is used as well as in documentation * formatting * Update new file from FloatTensor to Tensor	2024-05-10 09:53:31 -10:00
Sayak Paul	30e5e81d58	change to 2024 in the license (#6902) change to 2024	2024-02-08 08:19:31 -10:00
M. Tolga Cangöz	c697f52476	[`Docs`] Update and make improvements (#5819) Update and make improvements	2023-11-16 13:47:25 -08:00
M. Tolga Cangöz	53a8439fd1	[`Docs`] Fix typos and update files at Optimization Page (#5674) * Fix typos, update, trim trailing whitespace * Trim trailing whitespaces * Update docs/source/en/optimization/memory.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/memory.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update _toctree.yml * Update adapt_a_model.md * Reverse * Reverse * Reverse * Update dreambooth.md * Update instructpix2pix.md * Update lora.md * Update overview.md * Update t2i_adapters.md * Update text2image.md * Update text_inversion.md * Update create_dataset.md * Update create_dataset.md * Update create_dataset.md * Update create_dataset.md * Update coreml.md * Delete docs/source/en/training/create_dataset.md * Original create_dataset.md * Update create_dataset.md * Delete docs/source/en/training/create_dataset.md * Add original file * Delete docs/source/en/training/create_dataset.md * Add original one * Delete docs/source/en/training/text2image.md * Delete docs/source/en/training/instructpix2pix.md * Delete docs/source/en/training/dreambooth.md * Add original files --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-11-09 13:48:57 -08:00
Patrick von Platen	ad06e5106e	[Docs] Improve xformers page (#5196) [Docs] Improve	2023-09-27 16:02:15 +05:30
Steven Liu	19edca82f1	[docs] Create clearer optimization sections (#4870) * refactor * update general optim sections * update more sections * few more updates * benchmark code	2023-09-13 15:21:15 -07:00

18 Commits