* Cross-attention masks
* prefer qualified symbol, fix accidental Optional
* prefer qualified symbol in AttentionProcessor
* prefer qualified symbol in embeddings.py
* prefer qualified symbol in transformer_2d
* qualify FloatTensor in unet_2d_blocks
* move new transformer_2d params attention_mask, encoder_attention_mask to the end of the parameter list, which is assumed (e.g. by functions such as checkpoint()) to have a stable positional interface. regard return_dict as a special case which is assumed to be injected separately from positional params (e.g. by create_custom_forward()).
* move new encoder_attention_mask param to end of CrossAttn block interfaces and Unet2DCondition interface, to maintain the positional param interface.
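A minimal sketch of why the positional interface matters here, assuming the `create_custom_forward`/`checkpoint()` pattern the messages above reference (the toy block and its exact signature are illustrative):

```python
import torch
from torch.utils.checkpoint import checkpoint

def create_custom_forward(module, return_dict=None):
    # checkpoint() forwards positional args only; return_dict is injected
    # here instead, which is why it must stay out of the positional interface
    def custom_forward(*inputs):
        if return_dict is not None:
            return module(*inputs, return_dict=return_dict)
        return module(*inputs)
    return custom_forward

class ToyAttnBlock(torch.nn.Module):
    # hypothetical stand-in for Transformer2DModel: the new mask params sit
    # at the end, so older positional call sites keep binding correctly
    def forward(self, hidden_states, encoder_hidden_states=None,
                attention_mask=None, encoder_attention_mask=None,
                return_dict=True):
        sample = hidden_states  # a real block would attend here
        return (sample,) if not return_dict else {"sample": sample}

block = ToyAttnBlock()
x = torch.randn(1, 16, 32, requires_grad=True)
out = checkpoint(
    create_custom_forward(block, return_dict=False),
    x,
    None,  # encoder_hidden_states
    None,  # attention_mask
    None,  # encoder_attention_mask
)[0]
```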
* regenerate modeling_text_unet.py
* remove unused import
* unet_2d_condition encoder_attention_mask docs
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* versatile_diffusion/modeling_text_unet.py encoder_attention_mask docs
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* transformer_2d encoder_attention_mask docs
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* unet_2d_blocks.py: add parameter name comments
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* revert description. bool-to-bias treatment happens in unet_2d_condition only.
* comment parameter names
* fix copies, style
* encoder_attention_mask for SimpleCrossAttnDownBlock2D, SimpleCrossAttnUpBlock2D
* encoder_attention_mask for UNetMidBlock2DSimpleCrossAttn
* support attention_mask, encoder_attention_mask in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D, KAttentionBlock. fix binding of attention_mask, cross_attention_kwargs params in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D checkpoint invocations.
* fix mistake made during merge conflict resolution
* regenerate versatile_diffusion
* pass time embedding into checkpointed attention invocation
* always assume encoder_attention_mask is a mask (i.e. not a bias).
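For reference, a sketch of the bool-mask-to-bias conversion this implies happening once in unet_2d_condition (values and shapes illustrative):

```python
import torch

# encoder_attention_mask arrives as a keep(1)/discard(0) mask ...
encoder_attention_mask = torch.tensor([[1, 1, 1, 0, 0]])
# ... and is turned into an additive bias before reaching attention
bias = (1 - encoder_attention_mask.to(torch.float32)) * -10000.0
bias = bias.unsqueeze(1)  # [batch, 1, key_tokens]
```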
* style, fix-copies
* add tests for cross-attention masks
* add test for padding of attention mask
* explain mask's query_tokens dim. fix explanation about broadcasting over channels; we actually broadcast over query tokens
* support both masks and biases in Transformer2DModel#forward. document behaviour
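A hedged sketch of the documented behaviour: a 2-dim tensor is treated as a mask and converted, while a 3-dim tensor is assumed to already be a bias (the helper name is ours):

```python
import torch

def ensure_bias(mask, dtype):
    # 2-dim [batch, key_tokens]: a keep/discard mask -> convert to a bias and
    # add a singleton query_tokens dim, which broadcasts over query tokens
    if mask is not None and mask.ndim == 2:
        mask = (1 - mask.to(dtype)) * -10000.0
        mask = mask.unsqueeze(1)
    return mask  # 3-dim inputs pass through untouched, assumed to be biases

assert ensure_bias(torch.ones(2, 77), torch.float32).shape == (2, 1, 77)
```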
* fix-copies
* delete attention_mask docs on the basis I never tested self-attention masking myself. not comfortable explaining it, since I don't actually understand how a self-attn mask can work in its current form: the key length will be different in every ResBlock (we don't downsample the mask when we downsample the image).
* review feedback: the standard Unet blocks shouldn't pass temb to attn (only to resnet). remove from KCrossAttnDownBlock2D,KCrossAttnUpBlock2D#forward.
* remove encoder_attention_mask param from SimpleCrossAttn{Up,Down}Block2D,UNetMidBlock2DSimpleCrossAttn, and mask-choice in those blocks' #forward, on the basis that they only do one type of attention, so the consumer can pass whichever type of attention_mask is appropriate.
* put attention mask padding back to how it was (since the SD use-case it enabled wasn't important, and it breaks the original unclip use-case). disable the test which was added.
* fix-copies
* style
* fix-copies
* put encoder_attention_mask param back into Simple block forward interfaces, to ensure consistency of forward interface.
* restore passing of emb to KAttentionBlock#forward, on the basis that removal caused test failures. restore also the passing of emb to checkpointed calls to KAttentionBlock#forward.
* make simple unet2d blocks use encoder_attention_mask, but only when attention_mask is None. this should fix UnCLIP compatibility.
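A sketch of the mask choice described, with the UnCLIP rationale as a comment (free-function form for illustration):

```python
def pick_attn_mask(attention_mask, encoder_attention_mask, encoder_hidden_states):
    # attention_mask wins when present: UnCLIP historically routes its
    # cross-attn mask through attention_mask, and that must keep working
    if attention_mask is not None:
        return attention_mask
    # otherwise use the new param, but only when actually cross-attending
    return encoder_attention_mask if encoder_hidden_states is not None else None
```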
* fix copies
* Fix DPM single
* add test
* fix one more bug
* Apply suggestions from code review
Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
---------
Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
* up
* fix more
* Apply suggestions from code review
* fix more
* fix more
* Check it
* Remove 16:8
* fix more
* fix more
* fix more
* up
* up
* Test only stable diffusion
* Test only two files
* up
* Try out spinning up processes that can be killed
* up
* Apply suggestions from code review
* up
* up
* Remove ONNX tests from PR.
They are already a part of push_tests.yml.
* Remove mps tests from PRs.
They are already performed on push.
* Fix workflow name for fast push tests.
* Extract mps tests to a workflow.
For better control/filtering.
* Remove --extra-index-url from mps tests
* Increase tolerance of mps test
This test passes in my Mac (Ventura 13.3) but fails in the CI hardware
(Ventura 13.2). I ran the local tests following the same steps that
exist in the CI workflow.
* Temporarily run mps tests on pr
So we can test.
* Revert "Temporarily run mps tests on pr"
Tests passed, go back to running on push.
* Added explanation of 'strength' parameter
* Added get_timesteps function which relies on new strength parameter
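A sketch of that helper as borrowed from the img2img pipeline (free-function form here; diffusers defines it as a pipeline method):

```python
def get_timesteps(scheduler, num_inference_steps, strength):
    # strength == 1.0 keeps every step (pure-noise start); smaller values
    # skip the earliest, noisiest steps so more of the init image survives
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    timesteps = scheduler.timesteps[t_start * scheduler.order :]
    return timesteps, num_inference_steps - t_start
```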
* Added `strength` parameter which defaults to 1.
* Swapped ordering so `noise_timestep` can be calculated before masking the image
this is required when you aren't applying 100% noise to the masked region, e.g. strength < 1.
* Added strength to check_inputs, throws error if out of range
* Changed `prepare_latents` to initialise latents w.r.t strength
inspired by the stable diffusion img2img pipeline: init latents are initialised by converting the init image into a VAE latent and adding noise (based upon the strength parameter passed in), e.g. random when strength = 1, or the init image at strength = 0.
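A sketch of that initialisation under the assumptions above (`is_strength_max` and `latent_timestep` as named in the surrounding commits; shapes and scaling schematic):

```python
import torch

def prepare_latents(vae, scheduler, image, shape, latent_timestep,
                    is_strength_max, generator, device, dtype):
    noise = torch.randn(shape, generator=generator, device=device, dtype=dtype)
    if is_strength_max:
        # strength == 1: pure random noise, scaled for the scheduler
        return noise * scheduler.init_noise_sigma
    # strength < 1: encode the init image, then noise it to the first kept step
    image_latents = vae.encode(image).latent_dist.sample(generator)
    image_latents = vae.config.scaling_factor * image_latents
    return scheduler.add_noise(image_latents, noise, latent_timestep)
```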
* WIP: Added a unit test for the new strength parameter in the StableDiffusionInpaintingPipeline
still need to add correct regression values
* Created an is_strength_max flag to initialise from pure random noise
* Updated unit tests w.r.t new strength parameter + fixed new strength unit test
* renamed parameter to avoid confusion with variable of same name
* Updated regression values for new strength test - now passes
* removed 'copied from' comment as this method is now different and divergent from the copy
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Ensure backwards compatibility for prepare_mask_and_masked_image
created a return_image boolean and initialised to false
* Ensure backwards compatibility for prepare_latents
* Fixed copy check typo
* Fixes w.r.t. backward compatibility changes
* make style
* keep function argument ordering same for backwards compatibility in callees with copied from statements
* make fix-copies
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>
* refactor controlnet and add img2img and inpaint
* First draft to get pipelines to work
* make style
* Fix more
* Fix more
* More tests
* Fix more
* Make inpainting work
* make style and more tests
* Apply suggestions from code review
* up
* make style
* Fix imports
* Fix more
* Fix more
* Improve examples
* add test
* Make sure import is correctly deprecated
* Make sure everything works in compile mode
* make sure authorship is correctly attributed
* enable deterministic pytorch and cuda operations.
* disable manual seeding.
* make style && make quality for unet_2d tests.
* enable determinism for the unet2dconditional model.
* add CUBLAS_WORKSPACE_CONFIG for better reproducibility.
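Roughly, the setup these commits add looks like the following (a sketch; the exact CUBLAS workspace value is illustrative, and the tf32 line anticipates the 'disallow tf32 matmul' commit further down):

```python
import os
import torch

os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"  # cuBLAS needs this for determinism
torch.use_deterministic_algorithms(True)   # error on ops without deterministic kernels
torch.backends.cudnn.benchmark = False     # autotuning picks varying kernels
torch.backends.cudnn.deterministic = True
torch.backends.cuda.matmul.allow_tf32 = False  # tf32 matmuls drift across hardware
```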
* relax tolerance (very weird issue, though).
* revert to torch manual_seed() where needed.
* relax more tolerance.
* better placement of the cuda variable and relax more tolerance.
* enable determinism for 3d condition model.
* relax tolerance.
* add: determinism to alt_diffusion.
* relax tolerance for alt diffusion.
* dance diffusion.
* dance diffusion is flaky.
* test_dict_tuple_outputs_equivalent edit.
* fix two more tests.
* fix more ddim tests.
* fix: argument.
* change to diff in place of difference.
* fix: test_save_load call.
* test_save_load_float16 call.
* fix: expected_max_diff
* fix: paint by example.
* relax tolerance.
* add determinism to 1d unet model.
* torch 2.0 regressions seem to be brutal
* determinism to vae.
* add reason to skipping.
* up tolerance.
* determinism to vq.
* determinism to cuda.
* determinism to the generic test pipeline file.
* refactor general pipelines testing a bit.
* determinism to alt diffusion i2i
* up tolerance for alt diff i2i and audio diff
* up tolerance.
* determinism to audioldm
* increase tolerance for audioldm lms.
* increase tolerance for paint by example.
* increase tolerance for repaint.
* determinism to cycle diffusion and sd 1.
* relax tol for cycle diffusion 🚲
* relax tol for sd 1.0
* relax tol for controlnet.
* determinism to img var.
* relax tol for img variation.
* tolerance to i2i sd
* make style
* determinism to inpaint.
* relax tolerance for inpainting.
* determinism for inpainting legacy
* relax tolerance.
* determinism to instruct pix2pix
* determinism to model editing.
* model editing tolerance.
* panorama determinism
* determinism to pix2pix zero.
* determinism to sag.
* sd 2. determinism
* sd. tolerance
* disallow tf32 matmul.
* relax tolerance is all you need.
* make style and determinism to sd 2 depth
* relax tolerance for depth.
* tolerance to diffedit.
* tolerance to sd 2 inpaint.
* up tolerance.
* determinism in upscaling.
* tolerance in upscaler.
* more tolerance relaxation.
* determinism to v pred.
* up tol for v_pred
* unclip determinism
* determinism to unclip img2img
* determinism to text to video.
* determinism to last set of tests
* up tol.
* vq cumsum doesn't have a deterministic kernel
* relax tol
* relax tol
* add inferring_controlnet_cond_batch
* Revert "add inferring_controlnet_cond_batch"
This reverts commit abe8d6311d.
* set guess_mode to True
whenever global_pool_conditions is True
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
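A sketch of that coupling (attribute name per the commit; the surrounding pipeline plumbing is elided):

```python
def resolve_guess_mode(controlnet, guess_mode: bool) -> bool:
    # checkpoints trained with global average pooling of the control output
    # only behave sensibly in guess mode, so the flag is forced on for them
    return guess_mode or controlnet.config.global_pool_conditions
```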
* nit
* add integration test
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* StableDiffusionInpaintingPipeline now resizes input images and masks w.r.t. the passed input height and width. The default is already set to 512, and this addresses the common tensor-mismatch error. Also moved the type check into the relevant function to keep the main pipeline body tidy.
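A sketch of that resize under those defaults (the resampling filters are our assumption):

```python
import PIL.Image

def resize_inputs(image, mask, height=512, width=512):
    # resize both inputs to the requested (multiple-of-8) resolution so the
    # VAE latents and mask tensors line up instead of raising a shape error
    image = image.resize((width, height), resample=PIL.Image.LANCZOS)
    mask = mask.resize((width, height), resample=PIL.Image.NEAREST)
    return image, mask
```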
* Fixed StableDiffusionInpaintingPrepareMaskAndMaskedImageTests
Due to the previous commit these tests were failing, as height and width now need to be passed into the prepare_mask_and_masked_image function. I have updated the code and added a height/width variable per unit test, as that seemed more appropriate than the previous hard-coded solution
* Added a resolution test to StableDiffusionInpaintPipelineSlowTests
this unit test simply gets the input and resizes it into something that would fail (e.g. would throw a tensor-mismatch error / not be a multiple of 8), then passes it through the pipeline and verifies it produces output with the correct dims w.r.t. the passed height and width
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Batched load of textual inversions
- Only call resize_token_embeddings once per batch as it is the most expensive operation
- Allow pretrained_model_name_or_path and token to be an optional list
- Remove Dict from type annotation pretrained_model_name_or_path as it was not supported in this function
- Add comment that single files (e.g. .pt/.safetensors) are supported
- Add comment for token parameter
- Convert token override log message from warning to info
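A hedged usage sketch of the batched call (the concept repo and file path are illustrative):

```python
# both arguments may now be lists; embeddings are added in one pass and
# resize_token_embeddings runs once for the whole batch
pipe.load_textual_inversion(
    ["sd-concepts-library/cat-toy", "./learned_embeds.safetensors"],
    token=["<cat-toy>", None],  # None falls back to the token stored in the file
)
```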
* Update src/diffusers/loaders.py
Check for duplicate tokens
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update condition for None tokens
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fix multistep dpmsolver for cosine schedule (deepfloyd-if)
* fix a typo
* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update all dpmsolver (singlestep, multistep, dpm, dpm++) for cosine noise schedule
* add test, fix style
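A hedged usage sketch for a cosine-schedule (squaredcos_cap_v2) model such as DeepFloyd IF; the clip value shown is illustrative:

```python
from diffusers import DPMSolverMultistepScheduler

# pipe: an already-loaded DeepFloyd IF pipeline
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config,
    lambda_min_clipped=-5.1,        # clip the minimum log-SNR for stability
    variance_type="learned_range",  # IF's UNet predicts variance as well
)
```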
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix more torch compile breaks
* add tests
* Fix all
* fix controlnet
* fix more
* Add Horace He as co-author.
Co-authored-by: Horace He <horacehe2007@yahoo.com>
---------
Co-authored-by: Horace He <horacehe2007@yahoo.com>
* fix more
* Fix more
* fix more
* Apply suggestions from code review
* fix
* make style
* make fix-copies
* fix
* make sure torch compile
* Clean
* fix test
* Update Pix2PixZero Auto-correlation Loss
* Add Stable Diffusion DiffEdit pipeline
* Add draft documentation and import code
* Bugfixes and refactoring
* Add option to not decode latents in the inversion process
* Harmonize preprocessing
* Revert "Update Pix2PixZero Auto-correlation Loss"
This reverts commit b218062fed.
* Update annotations
* rename `compute_mask` to `generate_mask`
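For orientation, a hedged sketch of the three-stage DiffEdit flow after the rename (`raw_image` and the prompts are placeholders):

```python
# 1. diff the attention between two prompts to get an inpainting mask
mask_image = pipe.generate_mask(
    image=raw_image, source_prompt=source_prompt, target_prompt=target_prompt
)
# 2. invert the image to latents conditioned on the source prompt
inv_latents = pipe.invert(prompt=source_prompt, image=raw_image).latents
# 3. denoise toward the target prompt inside the generated mask
image = pipe(
    prompt=target_prompt, mask_image=mask_image, image_latents=inv_latents
).images[0]
```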
* Update documentation
* Update docs
* Update Docs
* Fix copy
* Change shape of output latents to batch first
* Update docs
* Add first draft for tests
* Bugfix and update tests
* Add `cross_attention_kwargs` support for all pipeline methods
* Fix Copies
* Add support for PIL image latents
Add support for mask broadcasting
Update docs and tests
Align `mask` argument to `mask_image`
Remove height and width arguments
* Enable MPS Tests
* Move example docstrings
* Fix test
* Fix test
* fix pipeline inheritance
* Harmonize `prepare_image_latents` with StableDiffusionPix2PixZeroPipeline
* Register modules set to `None` in config for `test_save_load_optional_components`
* Move fixed logic to specific test class
* Clean changes to other pipelines
* Update new tests to coordinate with #2953
* Update slow tests for better results
* Safety to avoid potential problems with torch.inference_mode
* Add reference in SD Pipeline Overview
* Fix tests again
* Enforce determinism in noise for generate_mask
* Fix copies
* Widen test tolerance for fp16 based on `test_stable_diffusion_upscale_pipeline_fp16`
* Add LoraLoaderMixin and update `prepare_image_latents`
* clean up repeat and reg
* bugfix
* Remove invalid args from docs
Suppress spurious warning by repeating image before latent to mask gen
* add
* clean
* up
* clean up more
* fix more tests
* Improve docs further
* improve
* more fixes docs
* Improve docs more
* Update src/diffusers/models/unet_2d_condition.py
* fix
* up
* update doc links
* make fix-copies
* add safety checker and watermarker to stage 3 doc page code snippets
* speed optimizations docs
* memory optimization docs
* make style
* add watermarking snippets to doc string examples
* make style
* use pt_to_pil helper functions in doc strings
* skip mps tests
* Improve safety
* make style
* new logic
* fix
* fix bad onnx design
* make new stable diffusion upscale pipeline model arguments optional
* define has_nsfw_concept when non-pil output type
* lowercase linked to notebook name
---------
Co-authored-by: William Berman <WLBberman@gmail.com>
When the token used for textual inversion does not have any special symbols (e.g. it is not surrounded by <>), the tokenizer does not properly split the replacement tokens. Adding a space for the padding tokens fixes this.
* Add karras pattern to discrete heun scheduler
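A hedged usage sketch of the new option (model id illustrative):

```python
from diffusers import DiffusionPipeline, HeunDiscreteScheduler

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# opt in to the Karras sigma spacing on the Heun scheduler
pipe.scheduler = HeunDiscreteScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)
```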
* Add integration test
* Fix failing CI on pytorch test on M1 (mps)
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update Pix2PixZero Auto-correlation Loss
* Add fast inversion tests
* Clarify purpose and mark as deprecated
Fix inversion prompt broadcasting
* Register modules set to `None` in config for `test_save_load_optional_components`
* Update new tests to coordinate with #2953
* add mixin class for pipeline from original sd ckpt
* Improve
* make style
* merge main into
* Improve more
* fix more
* up
* Apply suggestions from code review
* finish docs
* rename
* make style
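A hedged usage sketch of the mixin (checkpoint path illustrative):

```python
from diffusers import StableDiffusionPipeline

# build a diffusers pipeline directly from an original SD checkpoint file
pipe = StableDiffusionPipeline.from_ckpt("./v1-5-pruned-emaonly.safetensors")
```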
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add custom timesteps test
* add custom timesteps descending order check
* docs
* timesteps -> custom_timesteps
* can only pass one of num_inference_steps and timesteps
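A hedged usage sketch, assuming the DDPM scheduler these messages describe (model id illustrative):

```python
from diffusers import DDPMScheduler

scheduler = DDPMScheduler.from_pretrained("google/ddpm-cat-256")
# pass an explicit, strictly descending list -- mutually exclusive with
# num_inference_steps
scheduler.set_timesteps(timesteps=[999, 500, 250, 100, 0])
```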
* add guess mode (WIP)
* fix uncond/cond order
* support guidance_scale=1.0 and batch != 1
* remove magic coeff
* add docstring
* add integration test
* add document to controlnet.mdx
* made the comments a bit more explanatory
* fix table
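A hedged usage sketch of guess mode (`canny_image` is a prepared PIL edge map; the guidance value follows the docs' 3.0-5.0 suggestion):

```python
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny")
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet
)
# guess mode: ControlNet infers content from the control image alone,
# so the prompt may even be empty
image = pipe("", image=canny_image, guess_mode=True, guidance_scale=3.0).images[0]
```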
* fix: norm group test for UNet3D.
* chore: speed up the panorama tests (fast).
* set default value of _test_inference_batch_single_identical.
* fix: batch_sizes default value.