diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-29 07:22:12 +03:00

Author	SHA1	Message	Date
Patrick von Platen	ef9590712a	[Tests] Relax tolerance of flaky failing test (#3755 ) relax tolerance slightly	2023-06-12 18:28:30 +02:00
Patrick von Platen	74fd735eb0	Add draft for lora text encoder scale (#3626 ) * Add draft for lora text encoder scale * Improve naming * fix: training dreambooth lora script. * Apply suggestions from code review * Update examples/dreambooth/train_dreambooth_lora.py * Apply suggestions from code review * Apply suggestions from code review * add lora mixin when fit * add lora mixin when fit * add lora mixin when fit * fix more * fix more --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2023-06-06 22:47:46 +01:00
Sayak Paul	8669e8313d	[LoRA] feat: add lora attention processor for pt 2.0. (#3594 ) * feat: add lora attention processor for pt 2.0. * explicit context manager for SDPA. * switch to flash attention * make shapes compatible to work optimally with SDPA. * fix: circular import problem. * explicitly specify the flash attention kernel in sdpa * fall back to efficient attention context manager. * remove explicit dispatch. * fix: removed processor. * fix: remove optional from type annotation. * feat: make changes regarding LoRAAttnProcessor2_0. * remove confusing warning. * formatting. * relax tolerance for PT 2.0 * fix: loading message. * remove unnecessary logging. * add: entry to the docs. * add: network_alpha argument. * relax tolerance.	2023-06-06 14:56:05 +05:30
Takuma Mori	b45204ea5a	Add function to remove monkey-patch for text encoder LoRA (#3649 ) * merge undoable-monkeypatch * remove TEXT_ENCODER_TARGET_MODULES, refactoring * move create_lora_weight_file	2023-06-06 14:06:13 +05:30
Will Berman	41ae670828	move activation dispatches into helper function (#3656 ) * move activation dispatches into helper function * tests	2023-06-05 12:30:48 -07:00
Takuma Mori	8e552bb4fe	Support Kohya-ss style LoRA file format (in a limited capacity) (#3437 ) * add _convert_kohya_lora_to_diffusers * make style * add scaffold * match result: unet attention only * fix monkey-patch for text_encoder * with CLIPAttention While the terrible images are no longer produced, the results do not match those from the hook ver. This may be due to not setting the network_alpha value. * add to support network_alpha * generate diff image * fix monkey-patch for text_encoder * add test_text_encoder_lora_monkey_patch() * verify that it's okay to release the attn_procs * fix closure version * add comment * Revert "fix monkey-patch for text_encoder" This reverts commit `bb9c61e6fa`. * Fix to reuse utility functions * make LoRAAttnProcessor targets to self_attn * fix LoRAAttnProcessor target * make style * fix split key * Update src/diffusers/loaders.py * remove TEXT_ENCODER_TARGET_MODULES loop * add print memory usage * remove test_kohya_loras_scaffold.py * add: doc on LoRA civitai * remove print statement and refactor in the doc. * fix state_dict test for kohya-ss style lora * Apply suggestions from code review Co-authored-by: Takuma Mori <takuma104@gmail.com> --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2023-06-02 17:40:24 +05:30
Takuma Mori	67cf0445ef	Fix to apply LoRAXFormersAttnProcessor instead of LoRAAttnProcessor when xFormers is enabled (#3556 ) * fix to use LoRAXFormersAttnProcessor * add test * using new LoraLoaderMixin.save_lora_weights * add test_lora_save_load_with_xformers	2023-05-26 17:33:25 +05:30
Pedro Cuenca	bde2cb5d9b	Run `torch.compile` tests in separate subprocesses (#3503 ) * Run ControlNet compile test in a separate subprocess `torch.compile()` spawns several subprocesses and the GPU memory used was not reclaimed after the test ran. This approach was taken from `transformers`. * Style * Prepare a couple more compile tests to run in subprocess. * Use require_torch_2 decorator. * Test inpaint_compile in subprocess. * Run img2img compile test in subprocess. * Run stable diffusion compile test in subprocess. * style * Temporarily trigger on pr to test. * Revert "Temporarily trigger on pr to test." This reverts commit `82d76868dd`.	2023-05-23 19:24:17 +02:00
Birch-san	64bf5d33b7	Support for cross-attention bias / mask (#2634 ) * Cross-attention masks prefer qualified symbol, fix accidental Optional prefer qualified symbol in AttentionProcessor prefer qualified symbol in embeddings.py qualified symbol in transformed_2d qualify FloatTensor in unet_2d_blocks move new transformer_2d params attention_mask, encoder_attention_mask to the end of the section which is assumed (e.g. by functions such as checkpoint()) to have a stable positional param interface. regard return_dict as a special-case which is assumed to be injected separately from positional params (e.g. by create_custom_forward()). move new encoder_attention_mask param to end of CrossAttn block interfaces and Unet2DCondition interface, to maintain positional param interface. regenerate modeling_text_unet.py remove unused import unet_2d_condition encoder_attention_mask docs Co-authored-by: Pedro Cuenca <pedro@huggingface.co> versatile_diffusion/modeling_text_unet.py encoder_attention_mask docs Co-authored-by: Pedro Cuenca <pedro@huggingface.co> transformer_2d encoder_attention_mask docs Co-authored-by: Pedro Cuenca <pedro@huggingface.co> unet_2d_blocks.py: add parameter name comments Co-authored-by: Pedro Cuenca <pedro@huggingface.co> revert description. bool-to-bias treatment happens in unet_2d_condition only. comment parameter names fix copies, style * encoder_attention_mask for SimpleCrossAttnDownBlock2D, SimpleCrossAttnUpBlock2D * encoder_attention_mask for UNetMidBlock2DSimpleCrossAttn * support attention_mask, encoder_attention_mask in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D, KAttentionBlock. fix binding of attention_mask, cross_attention_kwargs params in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D checkpoint invocations. * fix mistake made during merge conflict resolution * regenerate versatile_diffusion * pass time embedding into checkpointed attention invocation * always assume encoder_attention_mask is a mask (i.e. not a bias). * style, fix-copies * add tests for cross-attention masks * add test for padding of attention mask * explain mask's query_tokens dim. fix explanation about broadcasting over channels; we actually broadcast over query tokens * support both masks and biases in Transformer2DModel#forward. document behaviour * fix-copies * delete attention_mask docs on the basis I never tested self-attention masking myself. not comfortable explaining it, since I don't actually understand how a self-attn mask can work in its current form: the key length will be different in every ResBlock (we don't downsample the mask when we downsample the image). * review feedback: the standard Unet blocks shouldn't pass temb to attn (only to resnet). remove from KCrossAttnDownBlock2D,KCrossAttnUpBlock2D#forward. * remove encoder_attention_mask param from SimpleCrossAttn{Up,Down}Block2D,UNetMidBlock2DSimpleCrossAttn, and mask-choice in those blocks' #forward, on the basis that they only do one type of attention, so the consumer can pass whichever type of attention_mask is appropriate. * put attention mask padding back to how it was (since the SD use-case it enabled wasn't important, and it breaks the original unclip use-case). disable the test which was added. * fix-copies * style * fix-copies * put encoder_attention_mask param back into Simple block forward interfaces, to ensure consistency of forward interface. * restore passing of emb to KAttentionBlock#forward, on the basis that removal caused test failures. restore also the passing of emb to checkpointed calls to KAttentionBlock#forward. * make simple unet2d blocks use encoder_attention_mask, but only when attention_mask is None. this should fix UnCLIP compatibility. * fix copies	2023-05-22 17:27:15 +01:00
Patrick von Platen	51843fd7d0	Refactor full determinism (#3485 ) * up * fix more * Apply suggestions from code review * fix more * fix more * Check it * Remove 16:8 * fix more * fix more * fix more * up * up * Test only stable diffusion * Test only two files * up * Try out spinning up processes that can be killed * up * Apply suggestions from code review * up * up	2023-05-22 11:15:11 +01:00
Will Berman	49b7ccfb96	parameterize pass single args through tuple (#3477 )	2023-05-18 10:14:29 -07:00
Will Berman	909742dbd6	attention refactor: the trilogy (#3387 ) * Replace `AttentionBlock` with `Attention` * use _from_deprecated_attn_block check re: @patrickvonplaten	2023-05-12 08:54:09 -06:00
Sayak Paul	90f5f3c4d4	[Tests] better determinism (#3374 ) * enable deterministic pytorch and cuda operations. * disable manual seeding. * make style && make quality for unet_2d tests. * enable determinism for the unet2dconditional model. * add CUBLAS_WORKSPACE_CONFIG for better reproducibility. * relax tolerance (very weird issue, though). * revert to torch manual_seed() where needed. * relax more tolerance. * better placement of the cuda variable and relax more tolerance. * enable determinism for 3d condition model. * relax tolerance. * add: determinism to alt_diffusion. * relax tolerance for alt diffusion. * dance diffusion. * dance diffusion is flaky. * test_dict_tuple_outputs_equivalent edit. * fix two more tests. * fix more ddim tests. * fix: argument. * change to diff in place of difference. * fix: test_save_load call. * test_save_load_float16 call. * fix: expected_max_diff * fix: paint by example. * relax tolerance. * add determinism to 1d unet model. * torch 2.0 regressions seem to be brutal * determinism to vae. * add reason to skipping. * up tolerance. * determinism to vq. * determinism to cuda. * determinism to the generic test pipeline file. * refactor general pipelines testing a bit. * determinism to alt diffusion i2i * up tolerance for alt diff i2i and audio diff * up tolerance. * determinism to audioldm * increase tolerance for audioldm lms. * increase tolerance for paint by paint. * increase tolerance for repaint. * determinism to cycle diffusion and sd 1. * relax tol for cycle diffusion 🚲 * relax tol for sd 1.0 * relax tol for controlnet. * determinism to img var. * relax tol for img variation. * tolerance to i2i sd * make style * determinism to inpaint. * relax tolerance for inpaiting. * determinism for inpainting legacy * relax tolerance. * determinism to instruct pix2pix * determinism to model editing. * model editing tolerance. * panorama determinism * determinism to pix2pix zero. * determinism to sag. * sd 2. determinism * sd. tolerance * disallow tf32 matmul. * relax tolerance is all you need. * make style and determinism to sd 2 depth * relax tolerance for depth. * tolerance to diffedit. * tolerance to sd 2 inpaint. * up tolerance. * determinism in upscaling. * tolerance in upscaler. * more tolerance relaxation. * determinism to v pred. * up tol for v_pred * unclip determinism * determinism to unclip img2img * determinism to text to video. * determinism to last set of tests * up tol. * vq cumsum doesn't have a deterministic kernel * relax tol * relax tol	2023-05-11 16:38:14 +01:00
Patrick von Platen	425192fe15	Make sure VAE attention works with Torch 2_0 (#3200 ) * Make sure attention works with Torch 2_0 * make style * Fix more	2023-04-22 17:29:29 +01:00
Patrick von Platen	17470057d2	make style	2023-04-20 13:09:20 +02:00
nupurkmr9	3979aac996	adding custom diffusion training to diffusers examples (#3031 ) * diffusers==0.14.0 update * custom diffusion update * custom diffusion update * custom diffusion update * custom diffusion update * custom diffusion update * custom diffusion update * custom diffusion * custom diffusion * custom diffusion * custom diffusion * custom diffusion * apply formatting and get rid of bare except. * refactor readme and other minor changes. * misc refactor. * fix: repo_id issue and loaders logging bug. * fix: save_model_card. * fix: save_model_card. * fix: save_model_card. * add: doc entry. * refactor doc,. * custom diffusion * custom diffusion * custom diffusion * apply style. * remove tralining whitespace. * fix: toctree entry. * remove unnecessary print. * custom diffusion * custom diffusion * custom diffusion test * custom diffusion xformer update * custom diffusion xformer update * custom diffusion xformer update --------- Co-authored-by: Nupur Kumari <nupurkumari@Nupurs-MacBook-Pro.local> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Nupur Kumari <nupurkumari@nupurs-mbp.wifi.local.cmu.edu>	2023-04-20 09:31:42 +02:00
Patrick von Platen	703307efcc	Fix config deprecation (#3129 ) * Better deprecation message * Better deprecation message * Better doc string * Fixes * fix more * fix more * Improve __getattr__ * correct more * fix more * fix * Improve more * more improvements * fix more * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * make style * Fix all rest & add tests & remove old deprecation fns --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-04-17 17:16:28 +01:00
Patrick von Platen	ed8fd38337	Improve deprecation warnings (#3131 )	2023-04-17 16:19:11 +01:00
Patrick von Platen	3a9d7d9758	[Tests] parallelize (#3078 ) * [Tests] parallelize * finish folder structuring * Parallelize tests more * Correct saving of pipelines * make sure logging level is correct * try again * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-04-13 13:32:57 +01:00
Andy	9d7c08f95e	[WIP] implement rest of the test cases (LoRA tests) (#2824 ) * inital commit for lora test cases * help a bit with lora for 3d * fixed lora tests * replaced redundant code --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2023-04-12 15:32:14 +05:30
Will Berman	c6180a311c	add only cross attention to simple attention blocks (#3011 ) * add only cross attention to simple attention blocks * add test for only_cross_attention re: @patrickvonplaten * mid_block_only_cross_attention better default allow mid_block_only_cross_attention to default to `only_cross_attention` when `only_cross_attention` is given as a single boolean	2023-04-11 14:38:50 -07:00
Patrick von Platen	8b451eb63b	Fix config prints and save, load of pipelines (#2849 ) * [Config] Fix config prints and save, load * Only use potential nn.Modules for dtype and device * Correct vae image processor * make sure in_channels is not accessed directly * make sure in channels is only accessed via config * Make sure schedulers only access config attributes * Make sure to access config in SAG * Fix vae processor and make style * add tests * uP * make style * Fix more naming issues * Final fix with vae config * change more	2023-04-11 13:35:42 +02:00
Sayak Paul	7139f0e874	fix: norm group test for UNet3D. (#2959 )	2023-04-04 09:01:15 +01:00
Patrick von Platen	d36103a089	[Tests] Speed up test (#2919 ) speed up test	2023-03-31 14:20:46 +01:00
Pedro Cuenca	b10f527577	Helper function to disable custom attention processors (#2791 ) * Helper function to disable custom attention processors. * Restore code deleted by mistake. * Format * Fix modeling_text_unet copy.	2023-03-27 20:31:19 +02:00
Sanchit Gandhi	b94880e536	Add AudioLDM (#2232 ) * Add AudioLDM * up * add vocoder * start unet * unconditional unet * clap, vocoder and vae * clean-up: conversion scripts * fix: conversion script token_type_ids * clean-up: pipeline docstring * tests: from SD * clean-up: cpu offload vocoder instead of safety checker * feat: adapt tests to audioldm * feat: add docs * clean-up: amend pipeline docstrings * clean-up: make style * clean-up: make fix-copies * fix: add doc path to toctree * clean-up: args for conversion script * clean-up: paths to checkpoints * fix: use conditional unet * clean-up: make style * fix: type hints for UNet * clean-up: docstring for UNet * clean-up: make style * clean-up: remove duplicate in docstring * clean-up: make style * clean-up: make fix-copies * clean-up: move imports to start in code snippet * fix: pass cross_attention_dim as a list/tuple to unet * clean-up: make fix-copies * fix: update checkpoint path * fix: unet cross_attention_dim in tests * film embeddings -> class embeddings * Apply suggestions from code review Co-authored-by: Will Berman <wlbberman@gmail.com> * fix: unet film embed to use existing args * fix: unet tests to use existing args * fix: make style * fix: transformers import and version in init * clean-up: make style * Revert "clean-up: make style" This reverts commit `5d6d1f8b32`. * clean-up: make style * clean-up: use pipeline tester mixin tests where poss * clean-up: skip attn slicing test * fix: add torch dtype to docs * fix: remove conversion script out of src * fix: remove .detach from 1d waveform * fix: reduce default num inf steps * fix: swap height/width -> audio_length_in_s * clean-up: make style * fix: remove nightly tests * fix: imports in conversion script * clean-up: slim-down to two slow tests * clean-up: slim-down fast tests * fix: batch consistent tests * clean-up: make style * clean-up: remove vae slicing fast test * clean-up: propagate changes to doc * fix: increase test tol to 1e-2 * clean-up: finish docs * clean-up: make style * feat: vocoder / VAE compatibility check * feat: possibly expand / cut audio waveform * fix: pipeline call signature test * fix: slow tests output len * clean-up: make style * make style --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: William Berman <WLBberman@gmail.com>	2023-03-23 19:00:21 +01:00
Pedro Cuenca	aa0531fa8d	Skip `mps` in text-to-video tests (#2792 ) * Skip mps in text-to-video tests. * style * Skip UNet3D mps tests.	2023-03-23 14:39:03 +01:00
Pedro Cuenca	92e1164e2e	`mps`: remove warmup passes (#2771 ) * Remove warmup passes in mps tests. * Update mps docs: no warmup pass in PyTorch 2 * Update imports.	2023-03-22 19:29:27 +01:00
Patrick von Platen	ca1a22296d	[MS Text To Video] Add first text to video (#2738 ) * [MS Text To Video} Add first text to video * upload * make first model example * match unet3d params * make sure weights are correcctly converted * improve * forward pass works, but diff result * make forward work * fix more * finish * refactor video output class. * feat: add support for a video export utility. * fix: opencv availability check. * run make fix-copies. * add: docs for the model components. * add: standalone pipeline doc. * edit docstring of the pipeline. * add: right path to TransformerTempModel * add: first set of tests. * complete fast tests for text to video. * fix bug * up * three fast tests failing. * add: note on slow tests * make work with all schedulers * apply styling. * add slow tests * change file name * update * more correction * more fixes * finish * up * Apply suggestions from code review * up * finish * make copies * fix pipeline tests * fix more tests * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * apply suggestions * up * revert --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-03-22 18:39:33 +01:00
Alexander Pivovarov	f024e00398	Fix typos (#2715 ) Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-03-21 13:45:04 +01:00
Patrick von Platen	9ecd924859	[Tests] Correct PT2 (#2724 ) * [Tests] Correct PT2 * correct more * move versatile to nightly * up * up * again * Apply suggestions from code review	2023-03-18 18:38:04 +01:00
Andy	116f70cbf8	Enabling gradient checkpointing for VAE (#2536 ) * updated black format * update black format * make style format * updated line endings * update code formatting * Update examples/research_projects/onnxruntime/text_to_image/train_text_to_image.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/diffusers/models/vae.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/diffusers/models/vae.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * added vae gradient checkpointing test * make style --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Will Berman <wlbberman@gmail.com>	2023-03-17 14:59:38 -07:00
Nicolas Patry	d9227cf788	Adding `use_safetensors` argument to give more control to users (#2123 ) * Adding `use_safetensors` argument to give more control to users about which weights they use. * Doc style. * Rebased (not functional). * Rebased and functional with tests. * Style. * Apply suggestions from code review * Style. * Addressing comments. * Update tests/test_pipelines.py Co-authored-by: Will Berman <wlbberman@gmail.com> * Black ??? --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Will Berman <wlbberman@gmail.com>	2023-03-16 15:57:43 +01:00
Patrick von Platen	e828232780	Rename attention (#2691 ) * rename file * rename attention * fix more * rename more * up * more deprecation imports * fixes	2023-03-16 00:35:54 +01:00
Kashif Rasul	cf4227cd1e	T5Attention support for cross-attention (#2654 ) * fix AttnProcessor2_0 Fix use of AttnProcessor2_0 for cross attention with mask * added scale_qk and out_bias flags * fixed for xformers * check if it has scale argument * Update cross_attention.py * check torch version * fix sliced attn * style * set scale * fix test * fixed addedKV processor * revert back AttnProcessor2_0 * if missing if * fix inner_dim --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-03-15 18:04:05 +01:00
Nicolas Patry	2ea1da89ab	Fix regression introduced in #2448 (#2551 ) * Fix regression introduced in #2448 * Style.	2023-03-04 16:11:57 +01:00
Nicolas Patry	1f4deb697f	Adding support for `safetensors` and LoRa. (#2448 ) * Adding support for `safetensors` and LoRa. * Adding metadata.	2023-03-03 18:00:19 +01:00
Patrick von Platen	eadf0e2555	[Copyright] 2023 (#2524 )	2023-03-01 10:31:00 +01:00
Pedro Cuenca	54bc882d96	`mps` test fixes (#2470 ) * Skip variant tests (UNet1d, UNetRL) on mps. mish op not yet supported. * Exclude a couple of panorama tests on mps They are too slow for fast CI. * Exclude mps panorama from more tests. * mps: exclude all fast panorama tests as they keep failing.	2023-02-24 15:19:53 +01:00
bddppq	5d4f59ee96	Fix running LoRA with xformers (#2286 ) * Fix running LoRA with xformers * support disabling xformers * reformat * Add test	2023-02-13 11:58:18 +01:00
Patrick von Platen	a7ca03aa85	Replace flake8 with ruff and update black (#2279 ) * before running make style * remove left overs from flake8 * finish * make fix-copies * final fix * more fixes	2023-02-07 23:46:23 +01:00
YiYi Xu	1051ca81a6	Stable Diffusion Latent Upscaler (#2059 ) * Modify UNet2DConditionModel - allow skipping mid_block - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size` - allow user to set dimension for the timestep embedding (`time_embed_dim`) - the kernel_size for `conv_in` and `conv_out` is now configurable - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj` - allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label))` - added 2 arguments `attn1_types` and `attn2_types` * currently we have argument `only_cross_attention`: when it's set to `True`, we will have a to the `BasicTransformerBlock` block with 2 cross-attention , otherwise we get a self-attention followed by a cross-attention; in k-upscaler, we need to have blocks that include just one cross-attention, or self-attention -> cross-attention; so I added `attn1_types` and `attn2_types` to the unet's argument list to allow user specify the attention types for the 2 positions in each block; note that I stil kept the `only_cross_attention` argument for unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passing down to the down blocks - the position of downsample layer and upsample layer is now configurable - in k-upscaler unet, there is only one skip connection per each up/down block (instead of each layer in stable diffusion unet), added `skip_freq = "block"` to support this use case - if user passes attention_mask to unet, it will prepare the mask and pass a flag to cross attention processer to skip the `prepare_attention_mask` step inside cross attention block add up/down blocks for k-upscaler modify CrossAttention class - make the `dropout` layer in `to_out` optional - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used to do cross attention, to_k, to_v has to be linear because the `encoder_hidden_states` is not 2d - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states - `attention_dropout`: add an optional dropout on attention score adapt BasicTransformerBlock - add an ada groupnorm layer to conditioning attention input with timestep embedding - allow skipping the FeedForward layer in between the attentions - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration update timestep embedding: add new act_fn gelu and an optional act_2 modified ResnetBlock2D - refactored with AdaGroupNorm class (the timestep scale shift normalization) - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv - add option to use input AdaGroupNorm on the input instead of groupnorm - add options to add a dropout layer after each conv - allow user to set the bias in conv_shortcut (needed for k-upscaler) - add gelu adding conversion script for k-upscaler unet add pipeline * fix attention mask * fix a typo * fix a bug * make sure model can be used with GPU * make pipeline work with fp16 * fix an error in BasicTransfomerBlock * make style * fix typo * some more fixes * uP * up * correct more * some clean-up * clean time proj * up * uP * more changes * remove the upcast_attention=True from unet config * remove attn1_types, attn2_types etc * fix * revert incorrect changes up/down samplers * make style * remove outdated files * Apply suggestions from code review * attention refactor * refactor cross attention * Apply suggestions from code review * update * up * update * Apply suggestions from code review * finish * Update src/diffusers/models/cross_attention.py * more fixes * up * up * up * finish * more corrections of conversion state * act_2 -> act_2_fn * remove dropout_after_conv from ResnetBlock2D * make style * simplify KAttentionBlock * add fast test for latent upscaler pipeline * add slow test * slow test fp16 * make style * add doc string for pipeline_stable_diffusion_latent_upscale * add api doc page for latent upscaler pipeline * deprecate attention mask * clean up embeddings * simplify resnet * up * clean up resnet * up * correct more * up * up * improve a bit more * correct more * more clean-ups * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add docstrings for new unet config * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * # Copied from * encode the image if not latent * remove force casting vae to fp32 * fix * add comments about preconditioning parameters from k-diffusion paper * attn1_type, attn2_type -> add_self_attention * clean up get_down_block and get_up_block * fix * fixed a typo(?) in ada group norm * update slice attention processer for cross attention * update slice * fix fast test * update the checkpoint * finish tests * fix-copies * fix-copy for modeling_text_unet.py * make style * make style * fix f-string * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix import * correct changes * fix resnet * make fix-copies * correct euler scheduler * add missing #copied from for preprocess * revert * fix * fix copies * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/models/cross_attention.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * clean up conversion script * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D * more * Update src/diffusers/models/unet_2d_condition.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * remove prepare_extra_step_kwargs * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix a typo in timestep embedding * remove num_image_per_prompt * fix fasttest * make style + fix-copies * fix * fix xformer test * fix style * doc string * make style * fix-copies * docstring for time_embedding_norm * make style * final finishes * make fix-copies * fix tests --------- Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-02-07 09:11:57 +01:00
Patrick von Platen	f653ded7ed	[LoRA] Make sure LoRA can be disabled after it's run (#2128 )	2023-01-26 21:26:11 +01:00
Patrick von Platen	6ba2231d72	Reproducibility 3/3 (#1924 ) * make tests deterministic * run slow tests * prepare for testing * finish * refactor * add print statements * finish more * correct some test failures * more fixes * set up to correct tests * more corrections * up * fix more * more prints * add * up * up * up * uP * uP * more fixes * uP * up * up * up * up * fix more * up * up * clean tests * up * up * up * more fixes * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * make * correct * finish * finish Co-authored-by: Suraj Patil <surajp815@gmail.com>	2023-01-25 13:44:22 +01:00
Patrick von Platen	ed616bd8a8	[LoRA] Add LoRA training script (#1884 ) * [Lora] first upload * add first lora version * upload * more * first training * up * correct * improve * finish loaders and inference * up * up * fix more * up * finish more * finish more * up * up * change year * revert year change * Change lines * Add cloneofsimo as co-author. Co-authored-by: Simo Ryu <cloneofsimo@gmail.com> * finish * fix docs * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com> * upload * finish Co-authored-by: Simo Ryu <cloneofsimo@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2023-01-18 18:05:51 +01:00
Patrick von Platen	29b2c93c90	Make repo structure consistent (#1862 ) * move files a bit * more refactors * fix more * more fixes * fix more onnx * make style * upload * fix * up * fix more * up again * up * small fix * Update src/diffusers/__init__.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * correct Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2022-12-30 11:51:08 +01:00
Patrick von Platen	4125756e88	Refactor cross attention and allow mechanism to tweak cross attention function (#1639 ) * first proposal * rename * up * Apply suggestions from code review * better * up * finish * up * rename * correct versatile * up * up * up * up * fix * Apply suggestions from code review * make style * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * add error message Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2022-12-20 18:49:05 +01:00
Patrick von Platen	cd91fc06fe	Re-add xformers enable to UNet2DCondition (#1627 ) * finish * fix * Update tests/models/test_models_unet_2d.py * style Co-authored-by: Anton Lozhkov <anton@huggingface.co>	2022-12-09 14:05:38 +01:00
Suraj Patil	bce65cd13a	[refactor] make set_attention_slice recursive (#1532 ) * make attn slice recursive * remove set_attention_slice from blocks * fix copies * make enable_attention_slicing base class method of DiffusionPipeline * fix set_attention_slice * fix set_attention_slice * fix copies * add tests * up * up * up * update * up * uP Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-12-05 17:31:04 +01:00
Anton Lozhkov	cc22bda5f6	[CI] Add slow MPS tests (#1104 ) * [CI] Add slow MPS tests * fix yml * temporarily resolve caching * Tests: fix mps crashes. * Skip test_load_pipeline_from_git on mps. Not compatible with float16. * Increase tolerance, use CPU generator, alt. slices. * Move to nightly * style Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2022-12-05 11:50:24 +01:00

... 3 4 5 6 7

316 Commits