diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Author	SHA1	Message	Date
Will Berman	fd5c3c09af	misc fixes (#2282 ) Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-02-08 09:02:42 -08:00
Patrick von Platen	648090e26e	fix pix2pix docs (#2290 )	2023-02-08 16:38:18 +01:00
Patrick von Platen	1ed6b77781	[Examples] Test all examples on CPU (#2289 ) * [Examples] Test all examples on CPU * add * correct * Apply suggestions from code review	2023-02-08 15:59:13 +01:00
Chenguo Lin	9d0d070996	EMA: fix `state_dict()` and `load_state_dict()` & add `cur_decay_value` (#2146 ) * EMA: fix `state_dict()` & add `cur_decay_value` * EMA: fix a bug in `load_state_dict()` 'float' object (`state_dict["power"]`) has no attribute 'get'. * del train_unconditional_ort.py	2023-02-08 10:44:50 +01:00
Isamu Isozaki	c1971a53bc	Textual inv save log memory (#2184 ) * Quality check and adding tokenizer * Adapted stable diffusion to mixed precision+finished up style fixes * Fixed based on patrick's review * Fixed oom from number of validation images * Removed unnecessary np.array conversion --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-02-08 10:37:10 +01:00
Patrick von Platen	41db2dbf90	correct tests	2023-02-08 11:12:51 +02:00
Patrick von Platen	a7ca03aa85	Replace flake8 with ruff and update black (#2279 ) * before running make style * remove left overs from flake8 * finish * make fix-copies * final fix * more fixes	2023-02-07 23:46:23 +01:00
Patrick von Platen	f5ccffecf7	Use `accelerate` save & loading hooks to have better checkpoint structure (#2048 ) * better accelerated saving * up * finish * finish * uP * up * up * fix * Apply suggestions from code review * correct ema * Remove @ * up * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/training/dreambooth.mdx Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-02-07 20:03:59 +01:00
Pedro Cuenca	e619db24be	mps cross-attention hack: don't crash on fp16 (#2258 ) * mps cross-attention hack: don't crash on fp16 * Make conversion explicit.	2023-02-07 19:51:33 +01:00
wfng92	111228cb39	Fix torchvision.transforms and transforms function naming clash (#2274 ) * Fix torchvision.transforms and transforms function naming clash * Update unconditional script for onnx * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-02-07 17:36:32 +01:00
Patrick von Platen	bbb46ad3d5	[Tests] Fix slow tests (#2271 )	2023-02-07 14:42:12 +01:00
wfng92	b1dad2e9d3	Make center crop and random flip as args for unconditional image generation (#2259 ) * Add center crop and horizontal flip to args * Update command to use center crop and random flip * Add center crop and horizontal flip to args * Update command to use center crop and random flip	2023-02-07 11:58:31 +01:00
Patrick von Platen	cd52475560	[Examples] Remove datasets important that is not needed (#2267 ) * [Examples] Remove datasets important that is not needed * remove from lora tambien	2023-02-07 11:55:34 +01:00
Patrick von Platen	0f04e799dc	fix vae pt script	2023-02-07 08:34:19 +00:00
YiYi Xu	1051ca81a6	Stable Diffusion Latent Upscaler (#2059 ) * Modify UNet2DConditionModel - allow skipping mid_block - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size` - allow user to set dimension for the timestep embedding (`time_embed_dim`) - the kernel_size for `conv_in` and `conv_out` is now configurable - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj` - allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label))` - added 2 arguments `attn1_types` and `attn2_types` * currently we have argument `only_cross_attention`: when it's set to `True`, we will have a to the `BasicTransformerBlock` block with 2 cross-attention , otherwise we get a self-attention followed by a cross-attention; in k-upscaler, we need to have blocks that include just one cross-attention, or self-attention -> cross-attention; so I added `attn1_types` and `attn2_types` to the unet's argument list to allow user specify the attention types for the 2 positions in each block; note that I stil kept the `only_cross_attention` argument for unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passing down to the down blocks - the position of downsample layer and upsample layer is now configurable - in k-upscaler unet, there is only one skip connection per each up/down block (instead of each layer in stable diffusion unet), added `skip_freq = "block"` to support this use case - if user passes attention_mask to unet, it will prepare the mask and pass a flag to cross attention processer to skip the `prepare_attention_mask` step inside cross attention block add up/down blocks for k-upscaler modify CrossAttention class - make the `dropout` layer in `to_out` optional - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used to do cross attention, to_k, to_v has to be linear because the `encoder_hidden_states` is not 2d - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states - `attention_dropout`: add an optional dropout on attention score adapt BasicTransformerBlock - add an ada groupnorm layer to conditioning attention input with timestep embedding - allow skipping the FeedForward layer in between the attentions - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration update timestep embedding: add new act_fn gelu and an optional act_2 modified ResnetBlock2D - refactored with AdaGroupNorm class (the timestep scale shift normalization) - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv - add option to use input AdaGroupNorm on the input instead of groupnorm - add options to add a dropout layer after each conv - allow user to set the bias in conv_shortcut (needed for k-upscaler) - add gelu adding conversion script for k-upscaler unet add pipeline * fix attention mask * fix a typo * fix a bug * make sure model can be used with GPU * make pipeline work with fp16 * fix an error in BasicTransfomerBlock * make style * fix typo * some more fixes * uP * up * correct more * some clean-up * clean time proj * up * uP * more changes * remove the upcast_attention=True from unet config * remove attn1_types, attn2_types etc * fix * revert incorrect changes up/down samplers * make style * remove outdated files * Apply suggestions from code review * attention refactor * refactor cross attention * Apply suggestions from code review * update * up * update * Apply suggestions from code review * finish * Update src/diffusers/models/cross_attention.py * more fixes * up * up * up * finish * more corrections of conversion state * act_2 -> act_2_fn * remove dropout_after_conv from ResnetBlock2D * make style * simplify KAttentionBlock * add fast test for latent upscaler pipeline * add slow test * slow test fp16 * make style * add doc string for pipeline_stable_diffusion_latent_upscale * add api doc page for latent upscaler pipeline * deprecate attention mask * clean up embeddings * simplify resnet * up * clean up resnet * up * correct more * up * up * improve a bit more * correct more * more clean-ups * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add docstrings for new unet config * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * # Copied from * encode the image if not latent * remove force casting vae to fp32 * fix * add comments about preconditioning parameters from k-diffusion paper * attn1_type, attn2_type -> add_self_attention * clean up get_down_block and get_up_block * fix * fixed a typo(?) in ada group norm * update slice attention processer for cross attention * update slice * fix fast test * update the checkpoint * finish tests * fix-copies * fix-copy for modeling_text_unet.py * make style * make style * fix f-string * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix import * correct changes * fix resnet * make fix-copies * correct euler scheduler * add missing #copied from for preprocess * revert * fix * fix copies * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/models/cross_attention.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * clean up conversion script * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D * more * Update src/diffusers/models/unet_2d_condition.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * remove prepare_extra_step_kwargs * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix a typo in timestep embedding * remove num_image_per_prompt * fix fasttest * make style + fix-copies * fix * fix xformer test * fix style * doc string * make style * fix-copies * docstring for time_embedding_norm * make style * final finishes * make fix-copies * fix tests --------- Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-02-07 09:11:57 +01:00
Patrick von Platen	3b66cc0fc1	make style	2023-02-07 08:11:22 +00:00
chavinlo	717a956a02	Create convert_vae_pt_to_diffusers.py (#2215 ) * Create convert_vae_pt_to_diffusers.py Just a simple script to convert VAE.pt files to diffusers format Tested with: https://huggingface.co/WarriorMama777/OrangeMixs/blob/main/VAEs/orangemix.vae.pt * Update convert_vae_pt_to_diffusers.py Forgot to add the function call * make style --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: chavinlo <example@example.com>	2023-02-07 09:10:34 +01:00
Jorge C. Gomes	d43972ae71	Fixes prompt input checks in StableDiffusion img2img pipeline (#2206 ) * Fixes prompt input checks in img2img Allows providing prompt_embeds instead of the prompt, which is not currently possible as the first check fails. This becomes the same as the function found in `8267c78445/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py (L393)` * Continues the fix This also needs to be fixed. Becomes consistent with `8267c78445/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py (L558)` I've now tested this implementation, and it produces the expected results.	2023-02-07 09:10:24 +01:00
Fazzie-Maqianli	ffed2420c4	fix distributed init twice (#2252 ) fix colossalai dreambooth	2023-02-07 08:55:39 +01:00
Pedro Cuenca	8178c840f2	Mention training problems with xFormers 0.0.16 (#2254 )	2023-02-06 11:19:26 +01:00
nickkolok	3a0d3da66f	Fix a typo: bfloa16 -> bfloat16 (#2243 )	2023-02-06 09:14:08 +01:00
psychedelicious	22c1ba56c2	Fix k_dpm_2 & k_dpm_2_a on MPS (#2241 ) Needed to convert `timesteps` to `float32` a bit sooner. Fixes #1537	2023-02-05 23:45:15 +01:00
Pedro Cuenca	7386e7730c	Show error when loading safety_checker `from_flax` (#2187 ) * Show error when loading safety_checker `from_flax` * fix style	2023-02-04 20:55:11 +01:00
Pedro Cuenca	154a7865fc	[Flax DDPM] Make `key` optional so default pipelines don't fail (#2176 ) Make `key` optional so default pipelines don't fail.	2023-02-04 20:45:20 +01:00
Robin Hutmacher	9baa29e9c0	Fix typo in StableDiffusionInpaintPipeline (#2197 ) * Fix typo in StableDiffusionInpaintPipeline * Add embedded prompt handling --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-02-03 19:03:15 +01:00
Jorge C. Gomes	58c416ab0c	Fixes LoRAXFormersCrossAttnProcessor (#2207 ) Related to #2124 The current implementation is throwing a shape mismatch error. Which makes sense, as this line is obviously missing, comparing to XFormersCrossAttnProcessor and LoRACrossAttnProcessor. I don't have formal tests, but I compared `LoRACrossAttnProcessor` and `LoRAXFormersCrossAttnProcessor` ad-hoc, and they produce the same results with this fix.	2023-02-03 18:10:48 +01:00
Isamu Isozaki	d46d78c584	Hotfix textual inv logging (#2183 )	2023-02-03 18:08:46 +01:00
Patrick von Platen	05168e5d83	make style	2023-02-03 19:03:13 +02:00
Justin Merrell	948022e1e8	fix: flagged_images implementation (#1947 ) Flagged images would be set to the blank image instead of the original image that contained the NSF concept for optional viewing. Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-02-03 18:02:56 +01:00
Patrick von Platen	2f9a70aa85	[LoRA] Make sure validation works in multi GPU setup (#2172 ) * [LoRA] Make sure validation works in multi GPU setup * more fixes * up	2023-02-03 16:50:10 +01:00
Sayak Paul	e43e206dc7	removes `~`s in favor of full-fledged links. (#2229 ) remove ~ in favor of full-fledged links.	2023-02-03 20:18:39 +05:30
Will Berman	99c39b4012	[nit] negative_prompt typo (#2227 ) * negative_prompt typo * fix	2023-02-03 14:05:50 +01:00
dymil	7547f9b475	Fix timestep dtype in legacy inpaint (#2120 ) * Fix timestep dtype in legacy inpaint This matches the structure in the text2img, img2img, and inpaint ONNX pipelines * Fix style in dtype patch	2023-02-03 13:04:21 +01:00
Prathik Rao	a87e87fcbe	refactor onnxruntime integration (#2042 ) * refactor onnxruntime integration * fix requirements.txt bug * make style * add support for textual_inversion * make style * add readme * cleanup README files * 1/27/2023 update to training scripts * make style * 1/30 update to train_unconditional * style with black-22.8.0 --------- Co-authored-by: Prathik Rao <prathikrao@microsoft.com> Co-authored-by: anton- <anton@huggingface.co>	2023-02-03 12:04:59 +01:00
Dudu Moshe	ecadcdefe1	[Bug] scheduling_ddpm: fix variance in the case of learned_range type. (#2090 ) scheduling_ddpm: fix variance in the case of learned_range type. In the case of learned_range variance type, there are missing logs and exponent comparing to the theory (see "Improved Denoising Diffusion Probabilistic Models" section 3.1 equation 15: https://arxiv.org/pdf/2102.09672.pdf).	2023-02-03 09:42:42 +01:00
Pedro Cuenca	2bbd532990	Docs: short section on changing the scheduler in Flax (#2181 ) * Short doc on changing the scheduler in Flax. * Apply fix from @patil-suraj Co-authored-by: Suraj Patil <surajp815@gmail.com> --------- Co-authored-by: Suraj Patil <surajp815@gmail.com>	2023-02-02 18:52:21 +01:00
Adalberto	68ef0666e2	Create train_dreambooth_inpaint_lora.py (#2205 ) * Create train_dreambooth_inpaint_lora.py * Update train_dreambooth_inpaint_lora.py * Update train_dreambooth_inpaint_lora.py * Update train_dreambooth_inpaint_lora.py * Update train_dreambooth_inpaint_lora.py	2023-02-02 13:15:15 +01:00
Kashif Rasul	7ac95703cd	add CITATION.cff (#2211 ) add citation.cff	2023-02-02 12:46:44 +01:00
Pedro Cuenca	3816c9ad9f	Update xFormers docs (#2208 ) Update xFormers docs.	2023-02-01 19:56:32 +01:00
Patrick von Platen	8267c78445	[Loading] Better error message on missing keys (#2198 ) * up * finish	2023-02-01 14:22:39 +01:00
Muyang Li	4fc7084875	Fix a dimension bug in Transform2d (#2144 ) The dimension does not match when `inner_dim` is not equal to `in_channels`.	2023-02-01 10:11:45 +01:00
Sayak Paul	9213d81bd0	add: guide on kerascv conversion tool. (#2169 ) * add: guide on kerascv conversion tool. * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> * address additional suggestions from review. * change links to documentation-images. * add separate links for training and inference goodies from diffusers. * address Patrick's comments. --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2023-02-01 09:41:00 +01:00
Asad Memon	dd3cae3327	Pass LoRA rank to LoRALinearLayer (#2191 )	2023-02-01 09:40:02 +01:00
Patrick von Platen	f73d0b6bec	[Docs] remove license (#2188 )	2023-01-31 22:11:32 +01:00
Patrick von Platen	d0d7ffffbd	[Docs] Add components to docs (#2175 )	2023-01-31 22:11:14 +01:00
Abhishek Varma	87cf88ed3d	Use `requests` instead of `wget` in `convert_from_ckpt.py` (#2168 ) -- This commit adopts `requests` in place of `wget` to fetch config `.yaml` files as part of `load_pipeline_from_original_stable_diffusion_ckpt` API. -- This was done because in Windows PowerShell one needs to explicitly ensure that `wget` binary is part of the PATH variable. If not present, this leads to the code not being able to download the `.yaml` config file. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com> Co-authored-by: Abhishek Varma <abhishek@nod-labs.com>	2023-01-31 14:35:45 +01:00
Patrick von Platen	60d915fbed	make style	2023-01-31 11:46:48 +00:00
1lint	d1efefe15e	[Breaking change] fix legacy inpaint noise and resize mask tensor (#2147 ) * fix legacy inpaint noise and resize mask tensor * updated legacy inpaint pipe test expected_slice	2023-01-31 12:44:35 +01:00
Sayak Paul	7d96b38b70	[examples] Fix CLI argument in the launch script command for text2image with LoRA (#2171 ) * Update README.md * Update README.md	2023-01-31 09:47:09 +01:00
Dudu Moshe	cedafb8600	[Bug]: fix DDPM scheduler arbitrary infer steps count. (#2076 ) scheduling_ddpm: fix evaluate with lower timesteps count than train. Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-01-31 09:13:26 +01:00


1769 Commits
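
Note on commit ecadcdefe1 (learned_range variance): the message points to equation 15 of "Improved Denoising Diffusion Probabilistic Models", which interpolates between the two variance bounds in log space and then exponentiates. The sketch below illustrates only that formula. It is not the code from `scheduling_ddpm`; the function name and parameter names are hypothetical, and it assumes the interpolation weight `v` has already been mapped into [0, 1].

```python
import math


def learned_range_variance(v, beta_t, beta_tilde_t):
    """Illustrative only: eq. 15 of the Improved DDPM paper.

    Sigma = exp(v * log(beta_t) + (1 - v) * log(beta_tilde_t))

    v            -- interpolation weight, assumed to already lie in [0, 1]
    beta_t       -- upper variance bound at this timestep
    beta_tilde_t -- lower (posterior) variance bound at this timestep
    """
    if not 0.0 <= v <= 1.0:
        raise ValueError("v must lie in [0, 1]")
    if beta_t <= 0.0 or beta_tilde_t <= 0.0:
        raise ValueError("variance bounds must be positive")
    # Interpolate in log space, then exponentiate back to a variance.
    log_variance = v * math.log(beta_t) + (1.0 - v) * math.log(beta_tilde_t)
    return math.exp(log_variance)
```

With `v = 1` the result is `beta_t`, with `v = 0` it is `beta_tilde_t`, and values in between give a geometric rather than arithmetic blend of the two bounds. Dropping the logs and the exponent, which is what the commit message describes as missing, would turn this into a plain linear interpolation.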