1
0
mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
Commit Graph

87 Commits

Author SHA1 Message Date
Patrick von Platen
ea1fcc28a4 [SDXL] Allow SDXL LoRA to be run with less than 16GB of VRAM (#4470)
* correct

* correct blocks

* finish

* finish

* finish

* Apply suggestions from code review

* fix

* up

* up

* up

* Update examples/dreambooth/README_sdxl.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Apply suggestions from code review

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-04 20:06:38 +02:00
Sayak Paul
06f73bd6d1 [Tests] Adds integration tests for SDXL LoRAs (#4462)
* add: integration tests for SDXL LoRAs.

* change pipeline class.

* fix assertion values.

* print values again.

* let's see.

* let's see.

* let's see.

* finish
2023-08-04 16:25:53 +05:30
Sayak Paul
18fc40c169 [Feat] add tiny Autoencoder for (almost) instant decoding (#4384)
* add: model implementation of tiny autoencoder.

* add: inits.

* push the latest devs.

* add: conversion script and finish.

* add: scaling factor args.

* debugging

* fix denormalization.

* fix: positional argument.

* handle use_torch_2_0_or_xformers.

* handle post_quant_conv

* handle dtype

* fix: sdxl image processor for tiny ae.

* fix: sdxl image processor for tiny ae.

* unify upcasting logic.

* copied from madness.

* remove trailing whitespace.

* set is_tiny_vae = False

* address PR comments.

* change to AutoencoderTiny

* make act_fn an str throughout

* fix: apply_forward_hook decorator call

* get rid of the special is_tiny_vae flag.

* directly scale the output.

* fix dummies?

* fix: act_fn.

* get rid of the Clamp() layer.

* bring back copied from.

* movement of the blocks to appropriate modules.

* add: docstrings to AutoencoderTiny

* add: documentation.

* changes to the conversion script.

* add doc entry.

* settle tests.

* style

* add one slow test.

* fix

* fix 2

* fix 2

* fix: 4

* fix: 5

* finish integration tests

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-08-02 23:58:05 +05:30
Sayak Paul
816ca0048f [LoRA] Fix SDXL text encoder LoRAs (#4371)
* temporarily disable text encoder loras.

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debbuging.

* modify doc.

* rename tests.

* print slices.

* fix: assertions

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-02 17:00:56 +05:30
Sayak Paul
4a4cdd6b07 [Feat] Support SDXL Kohya-style LoRA (#4287)
* sdxl lora changes.

* better name replacement.

* better replacement.

* debugging

* debugging

* debugging

* debugging

* debugging

* remove print.

* print state dict keys.

* print

* distingisuih better

* debuggable.

* fxi: tyests

* fix: arg from training script.

* access from class.

* run style

* debug

* save intermediate

* some simplifications for SDXL LoRA

* styling

* unet config is not needed in diffusers format.

* fix: dynamic SGM block mapping for SDXL kohya loras (#4322)

* Use lora compatible layers for linear proj_in/proj_out (#4323)

* improve condition for using the sgm_diffusers mapping

* informative comment.

* load compatible keys and embedding layer maaping.

* Get SDXL 1.0 example lora to load

* simplify

* specif ranks and hidden sizes.

* better handling of k rank and hidden

* debug

* debug

* debug

* debug

* debug

* fix: alpha keys

* add check for handling LoRAAttnAddedKVProcessor

* sanity comment

* modifications for text encoder SDXL

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* denugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* up

* up

* up

* up

* up

* up

* unneeded comments.

* unneeded comments.

* kwargs for the other attention processors.

* kwargs for the other attention processors.

* debugging

* debugging

* debugging

* debugging

* improve

* debugging

* debugging

* more print

* Fix alphas

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* clean up

* clean up.

* debugging

* fix: text

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Batuhan Taskaya <batuhan@python.org>
2023-07-28 19:49:49 +02:00
Patrick von Platen
1926331eaf [Local loading] Correct bug with local files only (#4318)
* [Local loading] Correct bug with local files only

* file not found error

* fix

* finish
2023-07-27 16:16:46 +02:00
Batuhan Taskaya
ff8f58086b Load Kohya-ss style LoRAs with auxilary states (#4147)
* Support to load Kohya-ss style LoRA file format (without restrictions)

Co-Authored-By: Takuma Mori <takuma104@gmail.com>
Co-Authored-By: Sayak Paul <spsayakpaul@gmail.com>

* tmp: add sdxl to mlp_modules

---------

Co-authored-by: Takuma Mori <takuma104@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-26 00:24:19 +02:00
Sayak Paul
365e8461ac [SDXL DreamBooth LoRA] add support for text encoder fine-tuning (#4097)
* Allow low precision sd xl

* finish

* finish

* feat: initial draft for supporting text encoder lora finetuning for SDXL DreamBooth

* fix: variable assignments.

* add: autocast block.

* add debugging

* vae dtype hell

* fix: vae dtype hell.

* fix: vae dtype hell 3.

* clean up

* lora text encoder loader.

* fix: unwrapping models.

* add: tests.

* docs.

* handle unexpected keys.

* fix vae dtype in the final inference.

* fix scope problem.

* fix: save_model_card args.

* initialize: prefix to None.

* fix: dtype issues.

* apply gixes.

* debgging.

* debugging

* debugging

* debugging

* debugging

* debugging

* add: fast tests.

* pre-tokenize.

* address: will's comments.

* fix: loader and tests.

* fix: dataloader.

* simplify dataloader.

* length.

* simplification.

* make style && make quality

* simplify state_dict munging

* fix: tests.

* fix: state_dict packing.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-25 05:35:48 +05:30
Batuhan Taskaya
ad787082e2 Fix unloading of LoRAs when xformers attention procs are in use (#4179) 2023-07-21 14:29:20 +05:30
Ruslan Vorovchenko
07f1fbb18e Asymmetric vqgan (#3956)
* added AsymmetricAutoencoderKL

* fixed copies+dummy

* added script to convert original asymmetric vqgan

* added docs

* updated docs

* fixed style

* fixes, added tests

* update doc

* fixed doc

* fixed tests

* naming

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* naming

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* udpated code example

* updated doc

* comments fixes

* added docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* comments fixes

* added inpaint pipeline tests

* comment suggestion: delete method

* yet another fixes

---------

Co-authored-by: Ruslan Vorovchenko <r.vorovchenko@prequelapp.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-20 17:51:06 +02:00
Patrick von Platen
6b1abba18d Add controlnet and vae from single file (#4084)
* Add controlnet from single file

* Updates

* make style

* finish

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-19 14:50:27 +02:00
Saurav Maheshkar
470f51cd26 feat: add act_fn param to OutValueFunctionBlock (#3994)
* feat: add act_fn param to OutValueFunctionBlock

* feat: update unet1d tests to not use mish

* feat: add `mish` as the default activation function

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* feat: drop mish tests from unet1d

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-19 14:44:44 +02:00
Sayak Paul
692b7a907d [Feat] add: utility for unloading lora. (#4034)
* add: test for testing unloading lora.

* add :reason to skipif.

* initial implementation of lora unload().

* apply styling.

* add: doc.

* change checkpoints.

* reinit generator

* finalize slow test.

* add fast test for unloading lora.
2023-07-14 16:30:18 +05:30
Will Berman
c2a28c346c Refactor LoRA (#3778)
* refactor to support patching LoRA into T5

instantiate the lora linear layer on the same device as the regular linear layer

get lora rank from state dict

tests

fmt

can create lora layer in float32 even when rest of model is float16

fix loading model hook

remove load_lora_weights_ and T5 dispatching

remove Unet#attn_processors_state_dict

docstrings

* text encoder monkeypatch class method

* fix test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-09 18:02:46 +02:00
Prathik Rao
de1426119d Make UNet2DConditionOutput pickle-able (#3857)
* add default to unet output to prevent it from being a required arg

* add unit test

* make style

* adjust unit test

* mark as fast test

* adjust assert statement in test

---------

Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-07-06 13:12:41 +05:30
Patrick von Platen
332d2bbea3 Improve memory text to video (#3930)
* Improve memory text to video

* Apply suggestions from code review

* add test

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* finish test setup

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-07-03 18:17:34 +02:00
Patrick von Platen
219636f7e4 improve tolerance 2023-06-28 13:29:36 +00:00
Patrick von Platen
88d269461c Correct bad attn naming (#3797)
* relax tolerance slightly

* correct incorrect naming

* correct namingc

* correct more

* Apply suggestions from code review

* Fix more

* Correct more

* correct incorrect naming

* Update src/diffusers/models/controlnet.py

* Correct flax

* Correct renaming

* Correct blocks

* Fix more

* Correct more

* mkae style

* mkae style

* mkae style

* mkae style

* mkae style

* Fix flax

* mkae style

* rename

* rename

* rename attn head dim to attention_head_dim

* correct flax

* make style

* improve

* Correct more

* make style

* fix more

* mkae style

* Update src/diffusers/models/controlnet_flax.py

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-06-22 13:52:48 +02:00
Will Berman
0bab447670 relax tol attention conversion test (#3842) 2023-06-21 12:35:38 -07:00
Will Berman
59aefe9ea6 device map legacy attention block weight conversion (#3804) 2023-06-16 10:39:20 -07:00
Patrick von Platen
ea8ae8c639 Complete set_attn_processor for prior and vae (#3796)
* relax tolerance slightly

* Add more tests

* upload readme

* upload readme

* Apply suggestions from code review

* Improve API Autoencoder KL

* finalize

* finalize tests

* finalize tests

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* up

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-15 17:42:49 +02:00
Patrick von Platen
ef9590712a [Tests] Relax tolerance of flaky failing test (#3755)
relax tolerance slightly
2023-06-12 18:28:30 +02:00
Patrick von Platen
74fd735eb0 Add draft for lora text encoder scale (#3626)
* Add draft for lora text encoder scale

* Improve naming

* fix: training dreambooth lora script.

* Apply suggestions from code review

* Update examples/dreambooth/train_dreambooth_lora.py

* Apply suggestions from code review

* Apply suggestions from code review

* add lora mixin when fit

* add lora mixin when fit

* add lora mixin when fit

* fix more

* fix more

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-06 22:47:46 +01:00
Sayak Paul
8669e8313d [LoRA] feat: add lora attention processor for pt 2.0. (#3594)
* feat: add lora attention processor for pt 2.0.

* explicit context manager for SDPA.

* switch to flash attention

* make shapes compatible to work optimally with SDPA.

* fix: circular import problem.

* explicitly specify the flash attention kernel in sdpa

* fall back to efficient attention context manager.

* remove explicit dispatch.

* fix: removed processor.

* fix: remove optional from type annotation.

* feat: make changes regarding LoRAAttnProcessor2_0.

* remove confusing warning.

* formatting.

* relax tolerance for PT 2.0

* fix: loading message.

* remove unnecessary logging.

* add: entry to the docs.

* add: network_alpha argument.

* relax tolerance.
2023-06-06 14:56:05 +05:30
Takuma Mori
b45204ea5a Add function to remove monkey-patch for text encoder LoRA (#3649)
* merge undoable-monkeypatch

* remove TEXT_ENCODER_TARGET_MODULES, refactoring

* move create_lora_weight_file
2023-06-06 14:06:13 +05:30
Will Berman
41ae670828 move activation dispatches into helper function (#3656)
* move activation dispatches into helper function

* tests
2023-06-05 12:30:48 -07:00
Takuma Mori
8e552bb4fe Support Kohya-ss style LoRA file format (in a limited capacity) (#3437)
* add _convert_kohya_lora_to_diffusers

* make style

* add scaffold

* match result: unet attention only

* fix monkey-patch for text_encoder

* with CLIPAttention

While the terrible images are no longer produced,
the results do not match those from the hook ver.
This may be due to not setting the network_alpha value.

* add to support network_alpha

* generate diff image

* fix monkey-patch for text_encoder

* add test_text_encoder_lora_monkey_patch()

* verify that it's okay to release the attn_procs

* fix closure version

* add comment

* Revert "fix monkey-patch for text_encoder"

This reverts commit bb9c61e6fa.

* Fix to reuse utility functions

* make LoRAAttnProcessor targets to self_attn

* fix LoRAAttnProcessor target

* make style

* fix split key

* Update src/diffusers/loaders.py

* remove TEXT_ENCODER_TARGET_MODULES loop

* add print memory usage

* remove test_kohya_loras_scaffold.py

* add: doc on LoRA civitai

* remove print statement and refactor in the doc.

* fix state_dict test for kohya-ss style lora

* Apply suggestions from code review

Co-authored-by: Takuma Mori <takuma104@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-02 17:40:24 +05:30
Takuma Mori
67cf0445ef Fix to apply LoRAXFormersAttnProcessor instead of LoRAAttnProcessor when xFormers is enabled (#3556)
* fix to use LoRAXFormersAttnProcessor

* add test

* using new LoraLoaderMixin.save_lora_weights

* add test_lora_save_load_with_xformers
2023-05-26 17:33:25 +05:30
Pedro Cuenca
bde2cb5d9b Run torch.compile tests in separate subprocesses (#3503)
* Run ControlNet compile test in a separate subprocess

`torch.compile()` spawns several subprocesses and the GPU memory used
was not reclaimed after the test ran. This approach was taken from
`transformers`.

* Style

* Prepare a couple more compile tests to run in subprocess.

* Use require_torch_2 decorator.

* Test inpaint_compile in subprocess.

* Run img2img compile test in subprocess.

* Run stable diffusion compile test in subprocess.

* style

* Temporarily trigger on pr to test.

* Revert "Temporarily trigger on pr to test."

This reverts commit 82d76868dd.
2023-05-23 19:24:17 +02:00
Birch-san
64bf5d33b7 Support for cross-attention bias / mask (#2634)
* Cross-attention masks

prefer qualified symbol, fix accidental Optional

prefer qualified symbol in AttentionProcessor

prefer qualified symbol in embeddings.py

qualified symbol in transformed_2d

qualify FloatTensor in unet_2d_blocks

move new transformer_2d params attention_mask, encoder_attention_mask to the end of the section which is assumed (e.g. by functions such as checkpoint()) to have a stable positional param interface. regard return_dict as a special-case which is assumed to be injected separately from positional params (e.g. by create_custom_forward()).

move new encoder_attention_mask param to end of CrossAttn block interfaces and Unet2DCondition interface, to maintain positional param interface.

regenerate modeling_text_unet.py

remove unused import

unet_2d_condition encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

versatile_diffusion/modeling_text_unet.py encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

transformer_2d encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

unet_2d_blocks.py: add parameter name comments

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

revert description. bool-to-bias treatment happens in unet_2d_condition only.

comment parameter names

fix copies, style

* encoder_attention_mask for SimpleCrossAttnDownBlock2D, SimpleCrossAttnUpBlock2D

* encoder_attention_mask for UNetMidBlock2DSimpleCrossAttn

* support attention_mask, encoder_attention_mask in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D, KAttentionBlock. fix binding of attention_mask, cross_attention_kwargs params in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D checkpoint invocations.

* fix mistake made during merge conflict resolution

* regenerate versatile_diffusion

* pass time embedding into checkpointed attention invocation

* always assume encoder_attention_mask is a mask (i.e. not a bias).

* style, fix-copies

* add tests for cross-attention masks

* add test for padding of attention mask

* explain mask's query_tokens dim. fix explanation about broadcasting over channels; we actually broadcast over query tokens

* support both masks and biases in Transformer2DModel#forward. document behaviour

* fix-copies

* delete attention_mask docs on the basis I never tested self-attention masking myself. not comfortable explaining it, since I don't actually understand how a self-attn mask can work in its current form: the key length will be different in every ResBlock (we don't downsample the mask when we downsample the image).

* review feedback: the standard Unet blocks shouldn't pass temb to attn (only to resnet). remove from KCrossAttnDownBlock2D,KCrossAttnUpBlock2D#forward.

* remove encoder_attention_mask param from SimpleCrossAttn{Up,Down}Block2D,UNetMidBlock2DSimpleCrossAttn, and mask-choice in those blocks' #forward, on the basis that they only do one type of attention, so the consumer can pass whichever type of attention_mask is appropriate.

* put attention mask padding back to how it was (since the SD use-case it enabled wasn't important, and it breaks the original unclip use-case). disable the test which was added.

* fix-copies

* style

* fix-copies

* put encoder_attention_mask param back into Simple block forward interfaces, to ensure consistency of forward interface.

* restore passing of emb to KAttentionBlock#forward, on the basis that removal caused test failures. restore also the passing of emb to checkpointed calls to KAttentionBlock#forward.

* make simple unet2d blocks use encoder_attention_mask, but only when attention_mask is None. this should fix UnCLIP compatibility.

* fix copies
2023-05-22 17:27:15 +01:00
Patrick von Platen
51843fd7d0 Refactor full determinism (#3485)
* up

* fix more

* Apply suggestions from code review

* fix more

* fix more

* Check it

* Remove 16:8

* fix more

* fix more

* fix more

* up

* up

* Test only stable diffusion

* Test only two files

* up

* Try out spinning up processes that can be killed

* up

* Apply suggestions from code review

* up

* up
2023-05-22 11:15:11 +01:00
Will Berman
49b7ccfb96 parameterize pass single args through tuple (#3477) 2023-05-18 10:14:29 -07:00
Will Berman
909742dbd6 attention refactor: the trilogy (#3387)
* Replace `AttentionBlock` with `Attention`

* use _from_deprecated_attn_block check re: @patrickvonplaten
2023-05-12 08:54:09 -06:00
Sayak Paul
90f5f3c4d4 [Tests] better determinism (#3374)
* enable deterministic pytorch and cuda operations.

* disable manual seeding.

* make style && make quality for unet_2d tests.

* enable determinism for the unet2dconditional model.

* add CUBLAS_WORKSPACE_CONFIG for better reproducibility.

* relax tolerance (very weird issue, though).

* revert to torch manual_seed() where needed.

* relax more tolerance.

* better placement of the cuda variable and relax more tolerance.

* enable determinism for 3d condition model.

* relax tolerance.

* add: determinism to alt_diffusion.

* relax tolerance for alt diffusion.

* dance diffusion.

* dance diffusion is flaky.

* test_dict_tuple_outputs_equivalent edit.

* fix two more tests.

* fix more ddim tests.

* fix: argument.

* change to diff in place of difference.

* fix: test_save_load call.

* test_save_load_float16 call.

* fix: expected_max_diff

* fix: paint by example.

* relax tolerance.

* add determinism to 1d unet model.

* torch 2.0 regressions seem to be brutal

* determinism to vae.

* add reason to skipping.

* up tolerance.

* determinism to vq.

* determinism to cuda.

* determinism to the generic test pipeline file.

* refactor general pipelines testing a bit.

* determinism to alt diffusion i2i

* up tolerance for alt diff i2i and audio diff

* up tolerance.

* determinism to audioldm

* increase tolerance for audioldm lms.

* increase tolerance for paint by paint.

* increase tolerance for repaint.

* determinism to cycle diffusion and sd 1.

* relax tol for cycle diffusion 🚲

* relax tol for sd 1.0

* relax tol for controlnet.

* determinism to img var.

* relax tol for img variation.

* tolerance to i2i sd

* make style

* determinism to inpaint.

* relax tolerance for inpaiting.

* determinism for inpainting legacy

* relax tolerance.

* determinism to instruct pix2pix

* determinism to model editing.

* model editing tolerance.

* panorama determinism

* determinism to pix2pix zero.

* determinism to sag.

* sd 2. determinism

* sd. tolerance

* disallow tf32 matmul.

* relax tolerance is all you need.

* make style and determinism to sd 2 depth

* relax tolerance for depth.

* tolerance to diffedit.

* tolerance to sd 2 inpaint.

* up tolerance.

* determinism in upscaling.

* tolerance in upscaler.

* more tolerance relaxation.

* determinism to v pred.

* up tol for v_pred

* unclip determinism

* determinism to unclip img2img

* determinism to text to video.

* determinism to last set of tests

* up tol.

* vq cumsum doesn't have a deterministic kernel

* relax tol

* relax tol
2023-05-11 16:38:14 +01:00
Patrick von Platen
425192fe15 Make sure VAE attention works with Torch 2_0 (#3200)
* Make sure attention works with Torch 2_0

* make style

* Fix more
2023-04-22 17:29:29 +01:00
Patrick von Platen
17470057d2 make style 2023-04-20 13:09:20 +02:00
nupurkmr9
3979aac996 adding custom diffusion training to diffusers examples (#3031)
* diffusers==0.14.0 update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion

* custom diffusion

* custom diffusion

* custom diffusion

* custom diffusion

* apply formatting and get rid of bare except.

* refactor readme and other minor changes.

* misc refactor.

* fix: repo_id issue and loaders logging bug.

* fix: save_model_card.

* fix: save_model_card.

* fix: save_model_card.

* add: doc entry.

* refactor doc,.

* custom diffusion

* custom diffusion

* custom diffusion

* apply style.

* remove tralining whitespace.

* fix: toctree entry.

* remove unnecessary print.

* custom diffusion

* custom diffusion

* custom diffusion test

* custom diffusion xformer update

* custom diffusion xformer update

* custom diffusion xformer update

---------

Co-authored-by: Nupur Kumari <nupurkumari@Nupurs-MacBook-Pro.local>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Nupur Kumari <nupurkumari@nupurs-mbp.wifi.local.cmu.edu>
2023-04-20 09:31:42 +02:00
Patrick von Platen
703307efcc Fix config deprecation (#3129)
* Better deprecation message

* Better deprecation message

* Better doc string

* Fixes

* fix more

* fix more

* Improve __getattr__

* correct more

* fix more

* fix

* Improve more

* more improvements

* fix more

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* make style

* Fix all rest & add tests & remove old deprecation fns

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-04-17 17:16:28 +01:00
Patrick von Platen
ed8fd38337 Improve deprecation warnings (#3131) 2023-04-17 16:19:11 +01:00
Patrick von Platen
3a9d7d9758 [Tests] parallelize (#3078)
* [Tests] parallelize

* finish folder structuring

* Parallelize tests more

* Correct saving of pipelines

* make sure logging level is correct

* try again

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-04-13 13:32:57 +01:00
Andy
9d7c08f95e [WIP] implement rest of the test cases (LoRA tests) (#2824)
* inital commit for lora test cases

* help a bit with lora for 3d

* fixed lora tests

* replaced redundant code

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-04-12 15:32:14 +05:30
Will Berman
c6180a311c add only cross attention to simple attention blocks (#3011)
* add only cross attention to simple attention blocks

* add test for only_cross_attention re: @patrickvonplaten

* mid_block_only_cross_attention better default

allow mid_block_only_cross_attention to default to
`only_cross_attention` when `only_cross_attention` is given
as a single boolean
2023-04-11 14:38:50 -07:00
Patrick von Platen
8b451eb63b Fix config prints and save, load of pipelines (#2849)
* [Config] Fix config prints and save, load

* Only use potential nn.Modules for dtype and device

* Correct vae image processor

* make sure in_channels is not accessed directly

* make sure in channels is only accessed via config

* Make sure schedulers only access config attributes

* Make sure to access config in SAG

* Fix vae processor and make style

* add tests

* uP

* make style

* Fix more naming issues

* Final fix with vae config

* change more
2023-04-11 13:35:42 +02:00
Sayak Paul
7139f0e874 fix: norm group test for UNet3D. (#2959) 2023-04-04 09:01:15 +01:00
Patrick von Platen
d36103a089 [Tests] Speed up test (#2919)
speed up test
2023-03-31 14:20:46 +01:00
Pedro Cuenca
b10f527577 Helper function to disable custom attention processors (#2791)
* Helper function to disable custom attention processors.

* Restore code deleted by mistake.

* Format

* Fix modeling_text_unet copy.
2023-03-27 20:31:19 +02:00
Sanchit Gandhi
b94880e536 Add AudioLDM (#2232)
* Add AudioLDM

* up

* add vocoder

* start unet

* unconditional unet

* clap, vocoder and vae

* clean-up: conversion scripts

* fix: conversion script token_type_ids

* clean-up: pipeline docstring

* tests: from SD

* clean-up: cpu offload vocoder instead of safety checker

* feat: adapt tests to audioldm

* feat: add docs

* clean-up: amend pipeline docstrings

* clean-up: make style

* clean-up: make fix-copies

* fix: add doc path to toctree

* clean-up: args for conversion script

* clean-up: paths to checkpoints

* fix: use conditional unet

* clean-up: make style

* fix: type hints for UNet

* clean-up: docstring for UNet

* clean-up: make style

* clean-up: remove duplicate in docstring

* clean-up: make style

* clean-up: make fix-copies

* clean-up: move imports to start in code snippet

* fix: pass cross_attention_dim as a list/tuple to unet

* clean-up: make fix-copies

* fix: update checkpoint path

* fix: unet cross_attention_dim in tests

* film embeddings -> class embeddings

* Apply suggestions from code review

Co-authored-by: Will Berman <wlbberman@gmail.com>

* fix: unet film embed to use existing args

* fix: unet tests to use existing args

* fix: make style

* fix: transformers import and version in init

* clean-up: make style

* Revert "clean-up: make style"

This reverts commit 5d6d1f8b32.

* clean-up: make style

* clean-up: use pipeline tester mixin tests where poss

* clean-up: skip attn slicing test

* fix: add torch dtype to docs

* fix: remove conversion script out of src

* fix: remove .detach from 1d waveform

* fix: reduce default num inf steps

* fix: swap height/width -> audio_length_in_s

* clean-up: make style

* fix: remove nightly tests

* fix: imports in conversion script

* clean-up: slim-down to two slow tests

* clean-up: slim-down fast tests

* fix: batch consistent tests

* clean-up: make style

* clean-up: remove vae slicing fast test

* clean-up: propagate changes to doc

* fix: increase test tol to 1e-2

* clean-up: finish docs

* clean-up: make style

* feat: vocoder / VAE compatibility check

* feat: possibly expand / cut audio waveform

* fix: pipeline call signature test

* fix: slow tests output len

* clean-up: make style

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>
2023-03-23 19:00:21 +01:00
Pedro Cuenca
aa0531fa8d Skip mps in text-to-video tests (#2792)
* Skip mps in text-to-video tests.

* style

* Skip UNet3D mps tests.
2023-03-23 14:39:03 +01:00
Pedro Cuenca
92e1164e2e mps: remove warmup passes (#2771)
* Remove warmup passes in mps tests.

* Update mps docs: no warmup pass in PyTorch 2

* Update imports.
2023-03-22 19:29:27 +01:00
Patrick von Platen
ca1a22296d [MS Text To Video] Add first text to video (#2738)
* [MS Text To Video} Add first text to video

* upload

* make first model example

* match unet3d params

* make sure weights are correcctly converted

* improve

* forward pass works, but diff result

* make forward work

* fix more

* finish

* refactor video output class.

* feat: add support for a video export utility.

* fix: opencv availability check.

* run make fix-copies.

* add: docs for the model components.

* add: standalone pipeline doc.

* edit docstring of the pipeline.

* add: right path to TransformerTempModel

* add: first set of tests.

* complete fast tests for text to video.

* fix bug

* up

* three fast tests failing.

* add: note on slow tests

* make work with all schedulers

* apply styling.

* add slow tests

* change file name

* update

* more correction

* more fixes

* finish

* up

* Apply suggestions from code review

* up

* finish

* make copies

* fix pipeline tests

* fix more tests

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* apply suggestions

* up

* revert

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-03-22 18:39:33 +01:00