* update
* fix
* non_blocking; handle parameters and buffers
* update
* Group offloading with cuda stream prefetching (#10516)
* cuda stream prefetch
* remove breakpoints
* update
* copy model hook implementation from pab
* update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite
* more workarounds to make it actually work
* cleanup
* rewrite
* update
* make sure to sync current stream before overwriting with pinned params
not doing so will lead to erroneous computations on the GPU and cause bad results
* better check
* update
* remove hook implementation to not deal with merge conflict
* re-add hook changes
* why use more memory when less memory do trick
* why still use slightly more memory when less memory do trick
* optimise
* add model tests
* add pipeline tests
* update docs
* add layernorm and groupnorm
* address review comments
* improve tests; add docs
* improve docs
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* apply suggestions from code review
* update tests
* apply suggestions from review
* enable_group_offloading -> enable_group_offload for naming consistency
* raise errors if multiple offloading strategies used; add relevant tests
* handle .to() when group offload applied
* refactor some repeated code
* remove unintentional change from merge conflict
* handle .cuda()
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update
* update
* make style
* remove dynamo disable
* add coauthor
Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>
* update
* update
* update
* update mixin
* add some basic tests
* update
* update
* non_blocking
* improvements
* update
* norm.* -> norm
* apply suggestions from review
* add example
* update hook implementation to the latest changes from pyramid attention broadcast
* deinitialize should raise an error
* update doc page
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update docs
* update
* refactor
* fix _always_upcast_modules for asym ae and vq_model
* fix lumina embedding forward to not depend on weight dtype
* refactor tests
* add simple lora inference tests
* _always_upcast_modules -> _precision_sensitive_module_patterns
* remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case
* check layer dtypes in lora test
* fix UNet1DModelTests::test_layerwise_upcasting_inference
* _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback
* skip test in NCSNppModelTests
* skip tests for AutoencoderTinyTests
* skip tests for AutoencoderOobleckTests
* skip tests for UNet1DModelTests - unsupported pytorch operations
* layerwise_upcasting -> layerwise_casting
* skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support
* add layerwise fp8 pipeline test
* use xfail
* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass)
* add note about memory consumption on tesla CI runner for failing test
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* remove 2 shapes from SDFunctionTesterMixin::test_vae_tiling
* combine freeu enable/disable test to reduce many inference runs
* remove low signal unet test for signature
* remove low signal embeddings test
* remove low signal progress bar test from PipelineTesterMixin
* combine ip-adapter single and multi tests to save many inferences
* fix broken tests
* Update tests/pipelines/test_pipelines_common.py
* Update tests/pipelines/test_pipelines_common.py
* add progress bar tests
* speed up test_vae_slicing in animatediff
* speed up test_karras_schedulers_shape for attend and excite.
* style.
* get the static slices out.
* specify torch print options.
* modify
* test run with controlnet
* specify kwarg
* fix: things
* not None
* flatten
* controlnet img2img
* complete controlet sd
* finish more
* finish more
* finish more
* finish more
* finish the final batch
* add cpu check for expected_pipe_slice.
* finish the rest
* remove print
* style
* fix ssd1b controlnet test
* checking ssd1b
* disable the test.
* make the test_ip_adapter_single controlnet test more robust
* fix: simple inpaint
* multi
* disable panorama
* enable again
* panorama is shaky so leave it for now
* remove print
* raise tolerance.
* Add properties and `IPAdapterTesterMixin` tests for `StableDiffusionPanoramaPipeline`
* Fix variable name typo and update comments
* Update deprecated `output_type="numpy"` to "np" in test files
* Discard changes to src/diffusers/pipelines/stable_diffusion_panorama/pipeline_stable_diffusion_panorama.py
* Update test_stable_diffusion_panorama.py
* Update numbers in README.md
* Update get_guidance_scale_embedding method to use timesteps instead of w
* Update number of checkpoints in README.md
* Add type hints and fix var name
* Fix PyTorch's convention for inplace functions
* Fix a typo
* Revert "Fix PyTorch's convention for inplace functions"
This reverts commit 74350cf65b.
* Fix typos
* Indent
* Refactor get_guidance_scale_embedding method in LEditsPPPipelineStableDiffusionXL class
* move model helper function in pipeline to EfficiencyMixin
---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Add custom timesteps support to LCMScheduler.
* Add custom timesteps support to StableDiffusionPipeline.
* Add custom timesteps support to StableDiffusionXLPipeline.
* Add custom timesteps support to remaining Stable Diffusion pipelines which support LCMScheduler (img2img, inpaint).
* Add custom timesteps support to remaining Stable Diffusion XL pipelines which support LCMScheduler (img2img, inpaint).
* Add custom timesteps support to StableDiffusionControlNetPipeline.
* Add custom timesteps support to T21 Stable Diffusion (XL) Adapters.
* Clean up Stable Diffusion inpaint tests.
* Manually add support for custom timesteps to AltDiffusion pipelines since make fix-copies doesn't appear to work correctly (it deletes the whole pipeline).
* make style
* Refactor pipeline timestep handling into the retrieve_timesteps function.
* fix
* fix copies
* remove heun from tests
* add back heun and fix the tests to include 2nd order
* fix the other test too
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* make style
* add more comments
---------
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update get_dummy_inputs(...) in T2I-Adapter tests to take image height and width as params.
* Update the T2I-Adapter unit tests to run with the standard number of UNet down blocks so that all T2I-Adapter down blocks get exercised.
* Update the T2I-Adapter down blocks to better match the padding behavior of the UNet.
* Revert "Update the T2I-Adapter unit tests to run with the standard number of UNet down blocks so that all T2I-Adapter down blocks get exercised."
This reverts commit 6d4a060a34.
* Create utility functions for testing the T2I-Adapter downscaling bahevior.
* (minor) Improve readability with an intermediate named variable.
* Statically parameterize T2I-Adapter test dimensions rather than generating them dynamically.
* Fix static checks.
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* fix: sdxl pipeline when unet is not available.
* fix moe
* account for text
* ifx more
* don't make unet optional.
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* split conditionals.
* add optional components to sdxl pipeline
* propagate changes to the rest of the pipelines.
* add: test
* add to all
* fix: rest of the pipelines.
* use pipeline_class variable
* separate pipeline mixin
* use safe_serialization
* fix: test
* access actual output.
* add: optional test to adapter and ip2p sdxl pipeline tests/
* add optional test to controlnet sdxl.
* fix tests
* fix ip2p tests
* fix more
* fifx more.
* use np output type.
* fix for StableDiffusionXLMultiControlNetPipelineFastTests.
* fix: SDXLOptionalComponentsTesterMixin
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fix tests
* Empty-Commit
* revert previous
* quality
* fix: test
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>