mirror of https://github.com/huggingface/diffusers.git synced 2026-01-29 07:22:12 +03:00
316 Commits

Author SHA1 Message Date
Patrick von Platen
ef9590712a [Tests] Relax tolerance of flaky failing test (#3755)
relax tolerance slightly
2023-06-12 18:28:30 +02:00
Patrick von Platen
74fd735eb0 Add draft for lora text encoder scale (#3626)
* Add draft for lora text encoder scale

* Improve naming

* fix: training dreambooth lora script.

* Apply suggestions from code review

* Update examples/dreambooth/train_dreambooth_lora.py

* Apply suggestions from code review

* Apply suggestions from code review

* add lora mixin when fit

* add lora mixin when fit

* add lora mixin when fit

* fix more

* fix more

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-06 22:47:46 +01:00
Sayak Paul
8669e8313d [LoRA] feat: add lora attention processor for pt 2.0. (#3594)
* feat: add lora attention processor for pt 2.0.

* explicit context manager for SDPA.

* switch to flash attention

* make shapes compatible to work optimally with SDPA.

* fix: circular import problem.

* explicitly specify the flash attention kernel in sdpa

* fall back to efficient attention context manager.

* remove explicit dispatch.

* fix: removed processor.

* fix: remove optional from type annotation.

* feat: make changes regarding LoRAAttnProcessor2_0.

* remove confusing warning.

* formatting.

* relax tolerance for PT 2.0

* fix: loading message.

* remove unnecessary logging.

* add: entry to the docs.

* add: network_alpha argument.

* relax tolerance.
2023-06-06 14:56:05 +05:30
Takuma Mori
b45204ea5a Add function to remove monkey-patch for text encoder LoRA (#3649)
* merge undoable-monkeypatch

* remove TEXT_ENCODER_TARGET_MODULES, refactoring

* move create_lora_weight_file
2023-06-06 14:06:13 +05:30
Will Berman
41ae670828 move activation dispatches into helper function (#3656)
* move activation dispatches into helper function

* tests
2023-06-05 12:30:48 -07:00
Takuma Mori
8e552bb4fe Support Kohya-ss style LoRA file format (in a limited capacity) (#3437)
* add _convert_kohya_lora_to_diffusers

* make style

* add scaffold

* match result: unet attention only

* fix monkey-patch for text_encoder

* with CLIPAttention

While the terrible images are no longer produced,
the results do not match those from the hook ver.
This may be due to not setting the network_alpha value.

* add to support network_alpha

* generate diff image

* fix monkey-patch for text_encoder

* add test_text_encoder_lora_monkey_patch()

* verify that it's okay to release the attn_procs

* fix closure version

* add comment

* Revert "fix monkey-patch for text_encoder"

This reverts commit bb9c61e6fa.

* Fix to reuse utility functions

* make LoRAAttnProcessor targets to self_attn

* fix LoRAAttnProcessor target

* make style

* fix split key

* Update src/diffusers/loaders.py

* remove TEXT_ENCODER_TARGET_MODULES loop

* add print memory usage

* remove test_kohya_loras_scaffold.py

* add: doc on LoRA civitai

* remove print statement and refactor in the doc.

* fix state_dict test for kohya-ss style lora

* Apply suggestions from code review

Co-authored-by: Takuma Mori <takuma104@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-02 17:40:24 +05:30
Takuma Mori
67cf0445ef Fix to apply LoRAXFormersAttnProcessor instead of LoRAAttnProcessor when xFormers is enabled (#3556)
* fix to use LoRAXFormersAttnProcessor

* add test

* using new LoraLoaderMixin.save_lora_weights

* add test_lora_save_load_with_xformers
2023-05-26 17:33:25 +05:30
Pedro Cuenca
bde2cb5d9b Run torch.compile tests in separate subprocesses (#3503)
* Run ControlNet compile test in a separate subprocess

`torch.compile()` spawns several subprocesses and the GPU memory used
was not reclaimed after the test ran. This approach was taken from
`transformers`.

* Style

* Prepare a couple more compile tests to run in subprocess.

* Use require_torch_2 decorator.

* Test inpaint_compile in subprocess.

* Run img2img compile test in subprocess.

* Run stable diffusion compile test in subprocess.

* style

* Temporarily trigger on pr to test.

* Revert "Temporarily trigger on pr to test."

This reverts commit 82d76868dd.
2023-05-23 19:24:17 +02:00
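The commit above runs compile tests in separate subprocesses so that GPU memory is released when each child exits. A minimal sketch of that isolation idea, assuming a plain `subprocess.run` call rather than the helper actually taken from `transformers`; `run_in_subprocess` is a hypothetical name, not a diffusers function:

```python
import subprocess
import sys


def run_in_subprocess(snippet: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a fresh interpreter (hypothetical helper).

    Everything the child allocates, including device memory, is returned
    to the system when the child process exits.
    """
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,  # collect stdout/stderr for assertions
        text=True,
        timeout=timeout,      # a hung child raises TimeoutExpired
    )


# A trivial stand-in for a real compile test body.
result = run_in_subprocess("print('ok')")
```

A real test would put the `torch.compile()` call and its assertions inside the snippet and check `result.returncode` in the parent.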
Birch-san
64bf5d33b7 Support for cross-attention bias / mask (#2634)
* Cross-attention masks

prefer qualified symbol, fix accidental Optional

prefer qualified symbol in AttentionProcessor

prefer qualified symbol in embeddings.py

qualified symbol in transformer_2d

qualify FloatTensor in unet_2d_blocks

move new transformer_2d params attention_mask, encoder_attention_mask to the end of the section which is assumed (e.g. by functions such as checkpoint()) to have a stable positional param interface. regard return_dict as a special-case which is assumed to be injected separately from positional params (e.g. by create_custom_forward()).

move new encoder_attention_mask param to end of CrossAttn block interfaces and Unet2DCondition interface, to maintain positional param interface.

regenerate modeling_text_unet.py

remove unused import

unet_2d_condition encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

versatile_diffusion/modeling_text_unet.py encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

transformer_2d encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

unet_2d_blocks.py: add parameter name comments

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

revert description. bool-to-bias treatment happens in unet_2d_condition only.

comment parameter names

fix copies, style

* encoder_attention_mask for SimpleCrossAttnDownBlock2D, SimpleCrossAttnUpBlock2D

* encoder_attention_mask for UNetMidBlock2DSimpleCrossAttn

* support attention_mask, encoder_attention_mask in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D, KAttentionBlock. fix binding of attention_mask, cross_attention_kwargs params in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D checkpoint invocations.

* fix mistake made during merge conflict resolution

* regenerate versatile_diffusion

* pass time embedding into checkpointed attention invocation

* always assume encoder_attention_mask is a mask (i.e. not a bias).

* style, fix-copies

* add tests for cross-attention masks

* add test for padding of attention mask

* explain mask's query_tokens dim. fix explanation about broadcasting over channels; we actually broadcast over query tokens

* support both masks and biases in Transformer2DModel#forward. document behaviour

* fix-copies

* delete attention_mask docs on the basis I never tested self-attention masking myself. not comfortable explaining it, since I don't actually understand how a self-attn mask can work in its current form: the key length will be different in every ResBlock (we don't downsample the mask when we downsample the image).

* review feedback: the standard Unet blocks shouldn't pass temb to attn (only to resnet). remove from KCrossAttnDownBlock2D,KCrossAttnUpBlock2D#forward.

* remove encoder_attention_mask param from SimpleCrossAttn{Up,Down}Block2D,UNetMidBlock2DSimpleCrossAttn, and mask-choice in those blocks' #forward, on the basis that they only do one type of attention, so the consumer can pass whichever type of attention_mask is appropriate.

* put attention mask padding back to how it was (since the SD use-case it enabled wasn't important, and it breaks the original unclip use-case). disable the test which was added.

* fix-copies

* style

* fix-copies

* put encoder_attention_mask param back into Simple block forward interfaces, to ensure consistency of forward interface.

* restore passing of emb to KAttentionBlock#forward, on the basis that removal caused test failures. restore also the passing of emb to checkpointed calls to KAttentionBlock#forward.

* make simple unet2d blocks use encoder_attention_mask, but only when attention_mask is None. this should fix UnCLIP compatibility.

* fix copies
2023-05-22 17:27:15 +01:00
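The commit above distinguishes a boolean attention mask from an additive bias. A rough sketch of that conversion in plain Python lists, not the tensor code from the commit; the function names and the `-10000.0` fill value are assumptions made for illustration:

```python
import math


def mask_to_bias(mask, fill=-10000.0):
    """Turn a keep/discard mask into an additive bias (hypothetical helper).

    Kept positions get 0.0; discarded positions get a large negative value
    (the exact constant is an assumption).
    """
    return [0.0 if keep else fill for keep in mask]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# One query attending to three key tokens; the last key is padding.
scores = [1.0, 2.0, 3.0]
bias = mask_to_bias([True, True, False])
weights = softmax([s + b for s, b in zip(scores, bias)])
```

Adding the bias before the softmax drives the weight of the masked key to effectively zero while the remaining weights still sum to one.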
Patrick von Platen
51843fd7d0 Refactor full determinism (#3485)
* up

* fix more

* Apply suggestions from code review

* fix more

* fix more

* Check it

* Remove 16:8

* fix more

* fix more

* fix more

* up

* up

* Test only stable diffusion

* Test only two files

* up

* Try out spinning up processes that can be killed

* up

* Apply suggestions from code review

* up

* up
2023-05-22 11:15:11 +01:00
Will Berman
49b7ccfb96 parameterize pass single args through tuple (#3477) 2023-05-18 10:14:29 -07:00
Will Berman
909742dbd6 attention refactor: the trilogy (#3387)
* Replace `AttentionBlock` with `Attention`

* use _from_deprecated_attn_block check re: @patrickvonplaten
2023-05-12 08:54:09 -06:00
Sayak Paul
90f5f3c4d4 [Tests] better determinism (#3374)
* enable deterministic pytorch and cuda operations.

* disable manual seeding.

* make style && make quality for unet_2d tests.

* enable determinism for the unet2dconditional model.

* add CUBLAS_WORKSPACE_CONFIG for better reproducibility.

* relax tolerance (very weird issue, though).

* revert to torch manual_seed() where needed.

* relax more tolerance.

* better placement of the cuda variable and relax more tolerance.

* enable determinism for 3d condition model.

* relax tolerance.

* add: determinism to alt_diffusion.

* relax tolerance for alt diffusion.

* dance diffusion.

* dance diffusion is flaky.

* test_dict_tuple_outputs_equivalent edit.

* fix two more tests.

* fix more ddim tests.

* fix: argument.

* change to diff in place of difference.

* fix: test_save_load call.

* test_save_load_float16 call.

* fix: expected_max_diff

* fix: paint by example.

* relax tolerance.

* add determinism to 1d unet model.

* torch 2.0 regressions seem to be brutal

* determinism to vae.

* add reason to skipping.

* up tolerance.

* determinism to vq.

* determinism to cuda.

* determinism to the generic test pipeline file.

* refactor general pipelines testing a bit.

* determinism to alt diffusion i2i

* up tolerance for alt diff i2i and audio diff

* up tolerance.

* determinism to audioldm

* increase tolerance for audioldm lms.

* increase tolerance for paint by paint.

* increase tolerance for repaint.

* determinism to cycle diffusion and sd 1.

* relax tol for cycle diffusion 🚲

* relax tol for sd 1.0

* relax tol for controlnet.

* determinism to img var.

* relax tol for img variation.

* tolerance to i2i sd

* make style

* determinism to inpaint.

* relax tolerance for inpainting.

* determinism for inpainting legacy

* relax tolerance.

* determinism to instruct pix2pix

* determinism to model editing.

* model editing tolerance.

* panorama determinism

* determinism to pix2pix zero.

* determinism to sag.

* sd 2. determinism

* sd. tolerance

* disallow tf32 matmul.

* relax tolerance is all you need.

* make style and determinism to sd 2 depth

* relax tolerance for depth.

* tolerance to diffedit.

* tolerance to sd 2 inpaint.

* up tolerance.

* determinism in upscaling.

* tolerance in upscaler.

* more tolerance relaxation.

* determinism to v pred.

* up tol for v_pred

* unclip determinism

* determinism to unclip img2img

* determinism to text to video.

* determinism to last set of tests

* up tol.

* vq cumsum doesn't have a deterministic kernel

* relax tol

* relax tol
2023-05-11 16:38:14 +01:00
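The commit above lists deterministic PyTorch/CUDA operations, `CUBLAS_WORKSPACE_CONFIG`, and disallowing TF32 matmul. A minimal sketch of how such switches could be combined, assuming the `":16:8"` workspace value from the PyTorch reproducibility notes; `make_deterministic` is a hypothetical name, and this is not the helper added by the commit:

```python
import os


def make_deterministic() -> None:
    """Enable deterministic settings for tests (hypothetical helper)."""
    # cuBLAS reads this when CUDA is initialised, so set it early.
    # ":16:8" is an assumed value; ":4096:8" is the other documented option.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":16:8"
    try:
        import torch
    except ImportError:
        # Without PyTorch installed only the environment variable is set.
        return
    # Raise an error when an op has no deterministic implementation.
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # TF32 trades precision for speed and changes numerical results.
    torch.backends.cuda.matmul.allow_tf32 = False


make_deterministic()
```

As the commit notes for vq cumsum, some ops have no deterministic kernel, so individual tests may still need to opt out of these settings.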
Patrick von Platen
425192fe15 Make sure VAE attention works with Torch 2_0 (#3200)
* Make sure attention works with Torch 2_0

* make style

* Fix more
2023-04-22 17:29:29 +01:00
Patrick von Platen
17470057d2 make style 2023-04-20 13:09:20 +02:00
nupurkmr9
3979aac996 adding custom diffusion training to diffusers examples (#3031)
* diffusers==0.14.0 update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion

* custom diffusion

* custom diffusion

* custom diffusion

* custom diffusion

* apply formatting and get rid of bare except.

* refactor readme and other minor changes.

* misc refactor.

* fix: repo_id issue and loaders logging bug.

* fix: save_model_card.

* fix: save_model_card.

* fix: save_model_card.

* add: doc entry.

* refactor doc.

* custom diffusion

* custom diffusion

* custom diffusion

* apply style.

* remove trailing whitespace.

* fix: toctree entry.

* remove unnecessary print.

* custom diffusion

* custom diffusion

* custom diffusion test

* custom diffusion xformer update

* custom diffusion xformer update

* custom diffusion xformer update

---------

Co-authored-by: Nupur Kumari <nupurkumari@Nupurs-MacBook-Pro.local>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Nupur Kumari <nupurkumari@nupurs-mbp.wifi.local.cmu.edu>
2023-04-20 09:31:42 +02:00
Patrick von Platen
703307efcc Fix config deprecation (#3129)
* Better deprecation message

* Better deprecation message

* Better doc string

* Fixes

* fix more

* fix more

* Improve __getattr__

* correct more

* fix more

* fix

* Improve more

* more improvements

* fix more

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* make style

* Fix all rest & add tests & remove old deprecation fns

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-04-17 17:16:28 +01:00
Patrick von Platen
ed8fd38337 Improve deprecation warnings (#3131) 2023-04-17 16:19:11 +01:00
Patrick von Platen
3a9d7d9758 [Tests] parallelize (#3078)
* [Tests] parallelize

* finish folder structuring

* Parallelize tests more

* Correct saving of pipelines

* make sure logging level is correct

* try again

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-04-13 13:32:57 +01:00
Andy
9d7c08f95e [WIP] implement rest of the test cases (LoRA tests) (#2824)
* initial commit for lora test cases

* help a bit with lora for 3d

* fixed lora tests

* replaced redundant code

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-04-12 15:32:14 +05:30
Will Berman
c6180a311c add only cross attention to simple attention blocks (#3011)
* add only cross attention to simple attention blocks

* add test for only_cross_attention re: @patrickvonplaten

* mid_block_only_cross_attention better default

allow mid_block_only_cross_attention to default to
`only_cross_attention` when `only_cross_attention` is given
as a single boolean
2023-04-11 14:38:50 -07:00
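The commit above says `mid_block_only_cross_attention` defaults to `only_cross_attention` when the latter is a single boolean. A small sketch of that rule; the function name, the `num_down_blocks` parameter, and the `False` fallback for the per-block case are assumptions, not confirmed by the commit message:

```python
from typing import Optional, Sequence, Tuple, Union


def resolve_only_cross_attention(
    only_cross_attention: Union[bool, Sequence[bool]],
    mid_block_only_cross_attention: Optional[bool] = None,
    num_down_blocks: int = 4,
) -> Tuple[Tuple[bool, ...], bool]:
    """Resolve per-block and mid-block flags (hypothetical helper)."""
    if isinstance(only_cross_attention, bool):
        # A single boolean also becomes the mid-block default.
        if mid_block_only_cross_attention is None:
            mid_block_only_cross_attention = only_cross_attention
        only_cross_attention = (only_cross_attention,) * num_down_blocks
    if mid_block_only_cross_attention is None:
        # Assumed fallback when per-block flags are given explicitly.
        mid_block_only_cross_attention = False
    return tuple(only_cross_attention), mid_block_only_cross_attention


per_block, mid = resolve_only_cross_attention(True, num_down_blocks=3)
```

An explicit `mid_block_only_cross_attention` always wins over the derived default in this sketch.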
Patrick von Platen
8b451eb63b Fix config prints and save, load of pipelines (#2849)
* [Config] Fix config prints and save, load

* Only use potential nn.Modules for dtype and device

* Correct vae image processor

* make sure in_channels is not accessed directly

* make sure in channels is only accessed via config

* Make sure schedulers only access config attributes

* Make sure to access config in SAG

* Fix vae processor and make style

* add tests

* uP

* make style

* Fix more naming issues

* Final fix with vae config

* change more
2023-04-11 13:35:42 +02:00
Sayak Paul
7139f0e874 fix: norm group test for UNet3D. (#2959) 2023-04-04 09:01:15 +01:00
Patrick von Platen
d36103a089 [Tests] Speed up test (#2919)
speed up test
2023-03-31 14:20:46 +01:00
Pedro Cuenca
b10f527577 Helper function to disable custom attention processors (#2791)
* Helper function to disable custom attention processors.

* Restore code deleted by mistake.

* Format

* Fix modeling_text_unet copy.
2023-03-27 20:31:19 +02:00
Sanchit Gandhi
b94880e536 Add AudioLDM (#2232)
* Add AudioLDM

* up

* add vocoder

* start unet

* unconditional unet

* clap, vocoder and vae

* clean-up: conversion scripts

* fix: conversion script token_type_ids

* clean-up: pipeline docstring

* tests: from SD

* clean-up: cpu offload vocoder instead of safety checker

* feat: adapt tests to audioldm

* feat: add docs

* clean-up: amend pipeline docstrings

* clean-up: make style

* clean-up: make fix-copies

* fix: add doc path to toctree

* clean-up: args for conversion script

* clean-up: paths to checkpoints

* fix: use conditional unet

* clean-up: make style

* fix: type hints for UNet

* clean-up: docstring for UNet

* clean-up: make style

* clean-up: remove duplicate in docstring

* clean-up: make style

* clean-up: make fix-copies

* clean-up: move imports to start in code snippet

* fix: pass cross_attention_dim as a list/tuple to unet

* clean-up: make fix-copies

* fix: update checkpoint path

* fix: unet cross_attention_dim in tests

* film embeddings -> class embeddings

* Apply suggestions from code review

Co-authored-by: Will Berman <wlbberman@gmail.com>

* fix: unet film embed to use existing args

* fix: unet tests to use existing args

* fix: make style

* fix: transformers import and version in init

* clean-up: make style

* Revert "clean-up: make style"

This reverts commit 5d6d1f8b32.

* clean-up: make style

* clean-up: use pipeline tester mixin tests where poss

* clean-up: skip attn slicing test

* fix: add torch dtype to docs

* fix: remove conversion script out of src

* fix: remove .detach from 1d waveform

* fix: reduce default num inf steps

* fix: swap height/width -> audio_length_in_s

* clean-up: make style

* fix: remove nightly tests

* fix: imports in conversion script

* clean-up: slim-down to two slow tests

* clean-up: slim-down fast tests

* fix: batch consistent tests

* clean-up: make style

* clean-up: remove vae slicing fast test

* clean-up: propagate changes to doc

* fix: increase test tol to 1e-2

* clean-up: finish docs

* clean-up: make style

* feat: vocoder / VAE compatibility check

* feat: possibly expand / cut audio waveform

* fix: pipeline call signature test

* fix: slow tests output len

* clean-up: make style

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>
2023-03-23 19:00:21 +01:00
Pedro Cuenca
aa0531fa8d Skip mps in text-to-video tests (#2792)
* Skip mps in text-to-video tests.

* style

* Skip UNet3D mps tests.
2023-03-23 14:39:03 +01:00
Pedro Cuenca
92e1164e2e mps: remove warmup passes (#2771)
* Remove warmup passes in mps tests.

* Update mps docs: no warmup pass in PyTorch 2

* Update imports.
2023-03-22 19:29:27 +01:00
Patrick von Platen
ca1a22296d [MS Text To Video] Add first text to video (#2738)
* [MS Text To Video] Add first text to video

* upload

* make first model example

* match unet3d params

* make sure weights are correctly converted

* improve

* forward pass works, but diff result

* make forward work

* fix more

* finish

* refactor video output class.

* feat: add support for a video export utility.

* fix: opencv availability check.

* run make fix-copies.

* add: docs for the model components.

* add: standalone pipeline doc.

* edit docstring of the pipeline.

* add: right path to TransformerTempModel

* add: first set of tests.

* complete fast tests for text to video.

* fix bug

* up

* three fast tests failing.

* add: note on slow tests

* make work with all schedulers

* apply styling.

* add slow tests

* change file name

* update

* more correction

* more fixes

* finish

* up

* Apply suggestions from code review

* up

* finish

* make copies

* fix pipeline tests

* fix more tests

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* apply suggestions

* up

* revert

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-03-22 18:39:33 +01:00
Alexander Pivovarov
f024e00398 Fix typos (#2715)
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-03-21 13:45:04 +01:00
Patrick von Platen
9ecd924859 [Tests] Correct PT2 (#2724)
* [Tests] Correct PT2

* correct more

* move versatile to nightly

* up

* up

* again

* Apply suggestions from code review
2023-03-18 18:38:04 +01:00
Andy
116f70cbf8 Enabling gradient checkpointing for VAE (#2536)
* updated black format

* update black format

* make style format

* updated line endings

* update code formatting

* Update examples/research_projects/onnxruntime/text_to_image/train_text_to_image.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/vae.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/vae.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* added vae gradient checkpointing test

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Will Berman <wlbberman@gmail.com>
2023-03-17 14:59:38 -07:00
Nicolas Patry
d9227cf788 Adding use_safetensors argument to give more control to users (#2123)
* Adding `use_safetensors` argument to give more control to users

about which weights they use.

* Doc style.

* Rebased (not functional).

* Rebased and functional with tests.

* Style.

* Apply suggestions from code review

* Style.

* Addressing comments.

* Update tests/test_pipelines.py

Co-authored-by: Will Berman <wlbberman@gmail.com>

* Black ???

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Will Berman <wlbberman@gmail.com>
2023-03-16 15:57:43 +01:00
Patrick von Platen
e828232780 Rename attention (#2691)
* rename file

* rename attention

* fix more

* rename more

* up

* more deprecation imports

* fixes
2023-03-16 00:35:54 +01:00
Kashif Rasul
cf4227cd1e T5Attention support for cross-attention (#2654)
* fix AttnProcessor2_0

Fix use of AttnProcessor2_0 for cross attention with mask

* added scale_qk and out_bias flags

* fixed for xformers

* check if it has scale argument

* Update cross_attention.py

* check torch version

* fix sliced attn

* style

* set scale

* fix test

* fixed addedKV processor

* revert back AttnProcessor2_0

* if missing if

* fix inner_dim

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-03-15 18:04:05 +01:00
Nicolas Patry
2ea1da89ab Fix regression introduced in #2448 (#2551)
* Fix regression introduced in #2448

* Style.
2023-03-04 16:11:57 +01:00
Nicolas Patry
1f4deb697f Adding support for safetensors and LoRa. (#2448)
* Adding support for `safetensors` and LoRa.

* Adding metadata.
2023-03-03 18:00:19 +01:00
Patrick von Platen
eadf0e2555 [Copyright] 2023 (#2524) 2023-03-01 10:31:00 +01:00
Pedro Cuenca
54bc882d96 mps test fixes (#2470)
* Skip variant tests (UNet1d, UNetRL) on mps.

mish op not yet supported.

* Exclude a couple of panorama tests on mps

They are too slow for fast CI.

* Exclude mps panorama from more tests.

* mps: exclude all fast panorama tests as they keep failing.
2023-02-24 15:19:53 +01:00
bddppq
5d4f59ee96 Fix running LoRA with xformers (#2286)
* Fix running LoRA with xformers

* support disabling xformers

* reformat

* Add test
2023-02-13 11:58:18 +01:00
Patrick von Platen
a7ca03aa85 Replace flake8 with ruff and update black (#2279)
* before running make style

* remove left overs from flake8

* finish

* make fix-copies

* final fix

* more fixes
2023-02-07 23:46:23 +01:00
YiYi Xu
1051ca81a6 Stable Diffusion Latent Upscaler (#2059)
* Modify UNet2DConditionModel

- allow skipping mid_block

- adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`

- allow user to set dimension for the timestep embedding (`time_embed_dim`)

- the kernel_size for `conv_in` and `conv_out` is now configurable

- add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`

- allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label)`

- added 2 arguments `attn1_types` and `attn2_types`

  * currently we have the argument `only_cross_attention`: when it's set to `True`, the `BasicTransformerBlock` has 2 cross-attention layers; otherwise we get a self-attention followed by a cross-attention. In k-upscaler, we need blocks that include just one cross-attention, or self-attention -> cross-attention, so I added `attn1_types` and `attn2_types` to the unet's argument list to allow the user to specify the attention types for the 2 positions in each block. Note that I still kept the `only_cross_attention` argument for the unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passing down to the down blocks

- the position of downsample layer and upsample layer is now configurable

- in k-upscaler unet, there is only one skip connection per up/down block (instead of one per layer as in the stable diffusion unet); added `skip_freq = "block"` to support this use case

- if the user passes attention_mask to the unet, it will prepare the mask and pass a flag to the cross attention processor to skip the `prepare_attention_mask` step inside the cross attention block

add up/down blocks for k-upscaler

modify CrossAttention class

- make the `dropout` layer in `to_out` optional

- `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. Note that when it's used for cross attention, `to_k` and `to_v` have to be linear because the `encoder_hidden_states` is not 2d

- `cross_attention_norm` - add an optional layernorm on encoder_hidden_states

- `attention_dropout`: add an optional dropout on attention score

adapt BasicTransformerBlock

- add an ada groupnorm layer  to conditioning attention input with timestep embedding

- allow skipping the FeedForward layer in between the attentions

- replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration

update timestep embedding: add new act_fn  gelu and an optional act_2

modified ResnetBlock2D

- refactored with AdaGroupNorm class (the timestep scale shift normalization)

- add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv

- add option to use input AdaGroupNorm on the input instead of groupnorm

- add options to add a dropout layer after each conv

- allow user to set the bias in conv_shortcut (needed for k-upscaler)

- add gelu

adding conversion script for k-upscaler unet

add pipeline

* fix attention mask

* fix a typo

* fix a bug

* make sure model can be used with GPU

* make pipeline work with fp16

* fix an error in BasicTransformerBlock

* make style

* fix typo

* some more fixes

* uP

* up

* correct more

* some clean-up

* clean time proj

* up

* uP

* more changes

* remove the upcast_attention=True from unet config

* remove attn1_types, attn2_types etc

* fix

* revert incorrect changes up/down samplers

* make style

* remove outdated files

* Apply suggestions from code review

* attention refactor

* refactor cross attention

* Apply suggestions from code review

* update

* up

* update

* Apply suggestions from code review

* finish

* Update src/diffusers/models/cross_attention.py

* more fixes

* up

* up

* up

* finish

* more corrections of conversion state

* act_2 -> act_2_fn

* remove dropout_after_conv from ResnetBlock2D

* make style

* simplify KAttentionBlock

* add fast test for latent upscaler pipeline

* add slow test

* slow test fp16

* make style

* add doc string for pipeline_stable_diffusion_latent_upscale

* add api doc page for latent upscaler pipeline

* deprecate attention mask

* clean up embeddings

* simplify resnet

* up

* clean up resnet

* up

* correct more

* up

* up

* improve a bit more

* correct more

* more clean-ups

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add docstrings for new unet config

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* # Copied from

* encode the image if not latent

* remove force casting vae to fp32

* fix

* add comments about preconditioning parameters from k-diffusion paper

* attn1_type, attn2_type -> add_self_attention

* clean up get_down_block and get_up_block

* fix

* fixed a typo(?) in ada group norm

* update slice attention processor for cross attention

* update slice

* fix fast test

* update the checkpoint

* finish tests

* fix-copies

* fix-copy for modeling_text_unet.py

* make style

* make style

* fix f-string

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix import

* correct changes

* fix resnet

* make fix-copies

* correct euler scheduler

* add missing #copied from for preprocess

* revert

* fix

* fix copies

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/models/cross_attention.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* clean up conversion script

* KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D

* more

* Update src/diffusers/models/unet_2d_condition.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* remove prepare_extra_step_kwargs

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix a typo in timestep embedding

* remove num_image_per_prompt

* fix fasttest

* make style + fix-copies

* fix

* fix xformer test

* fix style

* doc string

* make style

* fix-copies

* docstring for time_embedding_norm

* make style

* final finishes

* make fix-copies

* fix tests

---------

Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-02-07 09:11:57 +01:00
Patrick von Platen
f653ded7ed [LoRA] Make sure LoRA can be disabled after it's run (#2128) 2023-01-26 21:26:11 +01:00
Patrick von Platen
6ba2231d72 Reproducibility 3/3 (#1924)
* make tests deterministic

* run slow tests

* prepare for testing

* finish

* refactor

* add print statements

* finish more

* correct some test failures

* more fixes

* set up to correct tests

* more corrections

* up

* fix more

* more prints

* add

* up

* up

* up

* uP

* uP

* more fixes

* uP

* up

* up

* up

* up

* fix more

* up

* up

* clean tests

* up

* up

* up

* more fixes

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* make

* correct

* finish

* finish

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2023-01-25 13:44:22 +01:00
Patrick von Platen
ed616bd8a8 [LoRA] Add LoRA training script (#1884)
* [Lora] first upload

* add first lora version

* upload

* more

* first training

* up

* correct

* improve

* finish loaders and inference

* up

* up

* fix more

* up

* finish more

* finish more

* up

* up

* change year

* revert year change

* Change lines

* Add cloneofsimo as co-author.

Co-authored-by: Simo Ryu <cloneofsimo@gmail.com>

* finish

* fix docs

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* upload

* finish

Co-authored-by: Simo Ryu <cloneofsimo@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2023-01-18 18:05:51 +01:00
Patrick von Platen
29b2c93c90 Make repo structure consistent (#1862)
* move files a bit

* more refactors

* fix more

* more fixes

* fix more onnx

* make style

* upload

* fix

* up

* fix more

* up again

* up

* small fix

* Update src/diffusers/__init__.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* correct

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2022-12-30 11:51:08 +01:00
Patrick von Platen
4125756e88 Refactor cross attention and allow mechanism to tweak cross attention function (#1639)
* first proposal

* rename

* up

* Apply suggestions from code review

* better

* up

* finish

* up

* rename

* correct versatile

* up

* up

* up

* up

* fix

* Apply suggestions from code review

* make style

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add error message

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2022-12-20 18:49:05 +01:00
Patrick von Platen
cd91fc06fe Re-add xformers enable to UNet2DCondition (#1627)
* finish

* fix

* Update tests/models/test_models_unet_2d.py

* style

Co-authored-by: Anton Lozhkov <anton@huggingface.co>
2022-12-09 14:05:38 +01:00
Suraj Patil
bce65cd13a [refactor] make set_attention_slice recursive (#1532)
* make attn slice recursive

* remove set_attention_slice from blocks

* fix copies

* make enable_attention_slicing base class method of DiffusionPipeline

* fix set_attention_slice

* fix set_attention_slice

* fix copies

* add tests

* up

* up

* up

* update

* up

* uP

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-12-05 17:31:04 +01:00
Anton Lozhkov
cc22bda5f6 [CI] Add slow MPS tests (#1104)
* [CI] Add slow MPS tests

* fix yml

* temporarily resolve caching

* Tests: fix mps crashes.

* Skip test_load_pipeline_from_git on mps.

Not compatible with float16.

* Increase tolerance, use CPU generator, alt. slices.

* Move to nightly

* style

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2022-12-05 11:50:24 +01:00