diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-29 07:22:12 +03:00

Author	SHA1	Message	Date
Will Berman	2bc0a15818	config fixes (#3060 )	2023-05-05 07:22:12 -07:00
Patrick von Platen	473d4d3961	Fix config prints and save, load of pipelines (#2849 ) * [Config] Fix config prints and save, load * Only use potential nn.Modules for dtype and device * Correct vae image processor * make sure in_channels is not accessed directly * make sure in channels is only accessed via config * Make sure schedulers only access config attributes * Make sure to access config in SAG * Fix vae processor and make style * add tests * uP * make style * Fix more naming issues * Final fix with vae config * change more	2023-05-05 07:22:12 -07:00
Pedro Cuenca	11ad62cb4f	mps: skip unstable test (#3037 )	2023-05-05 07:22:12 -07:00
Andranik Movsisyan	aa028386f3	[Pipeline] Add TextToVideoZeroPipeline (#2954 ) * add TextToVideoZeroPipeline and CrossFrameAttnProcessor * add docs for text-to-video zero * add teaser image for text-to-video zero docs * Fix review changes. Add Documentation. Add test * clean up the codes in pipeline_text_to_video.py. Add descriptive comments and docstrings * make style && make quality * make fix-copies * make requested changes to docs. use huggingface server links for resources, delete res folder * make style && make quality && make fix-copies * make style && make quality * Apply suggestions from code review --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2023-05-05 07:22:12 -07:00
William Berman	567e1caef5	tests and additional scheduler fixes	2023-05-05 07:22:12 -07:00
William Berman	51c7041e0e	fix simple attention processor encoder hidden states ordering	2023-05-05 07:22:12 -07:00
Will Berman	025609e5f3	dynamic threshold sampling bug fixes and docs (#3003 ) dynamic threshold sampling bug fix and docs	2023-05-05 07:22:12 -07:00
Nipun Jindal	f704da537d	[2905]: Add Karras pattern to discrete euler (#2956 ) * [2905]: Add Karras pattern to discrete euler * [2905]: Add Karras pattern to discrete euler * Review comments * Review comments * Review comments * Review comments --------- Co-authored-by: njindal <njindal@adobe.com>	2023-05-05 07:22:12 -07:00
Patrick von Platen	0aa92a53ad	[Pipeline download] Improve pipeline download for index and passed co… (#2980 ) * [Pipeline download] Improve pipeline download for index and passed components * correct * add more tests * up	2023-05-05 07:22:12 -07:00
Daniel Gu	48f2c25f36	Fix bugs in convert_from_ckpt.py and UniDiffuserTextDecoder encode_prefix()	2023-04-24 21:02:05 -07:00
Daniel Gu	3c9b20e066	Alter the U-ViT implementation (UniDiffuserModel and associated building blocks) to more closely match the original UniDiffusers implementation.	2023-04-19 22:37:00 -07:00
Daniel Gu	84781fbd67	Fix noise pred timestep, clip_tokenizer, CLIP image encoding, and other bugs.	2023-04-17 19:06:19 -07:00
Daniel Gu	0300563861	Simplify the UTransformer2DModel / UniDiffuserModel implementation and fix some more bugs.	2023-04-16 22:25:15 -07:00
Daniel Gu	a492e0c86f	Add UniDiffuser classes to __init__ files, modify transformer block to support pre- and post-LN, add fast default tests, fix some bugs.	2023-04-14 00:54:02 -07:00
Daniel Gu	afe5ba0f20	Initial commit for a image-text UniDiffuser pipeline.	2023-04-03 23:40:20 -07:00
Sayak Paul	7139f0e874	fix: norm group test for UNet3D. (#2959 )	2023-04-04 09:01:15 +01:00
Patrick von Platen	d36103a089	[Tests] Speed up test (#2919 ) speed up test	2023-03-31 14:20:46 +01:00
Patrick von Platen	e1144ac20c	Fix slow tests text inv (#2915 ) * fix slow tests * uP	2023-03-31 10:03:32 +01:00
Takuma Mori	0df4ad541f	Add support `Karras sigmas` for StableDiffusionKDiffusionPipeline (#2874 ) * add use_karras_sigmas option thanks @Stax124 * fix sigma_min/max from scheduler.sigmas * add docstring * revert to use k_diffusion_model.sigma, to(device) * add integration test * make style	2023-03-31 09:12:11 +05:30
Pi Esposito	a937e1b594	add load textual inversion embeddings to stable diffusion (#2009 ) * add load textual inversion embeddings draft * fix quality * fix typo * make fix copies * move to textual inversion mixin * make it accept from sd-concept library * accept list of paths to embeddings * fix styling of stable diffusion pipeline * add dummy TextualInversionMixin * add docstring to textualinversionmixin * add load textual inversion embeddings draft * fix quality * fix typo * make fix copies * move to textual inversion mixin * make it accept from sd-concept library * accept list of paths to embeddings * fix styling of stable diffusion pipeline * add dummy TextualInversionMixin * add docstring to textualinversionmixin * add case for parsing embedding from auto1111 UI format Co-authored-by: Evan Jones <evan.a.jones3@gmail.com> Co-authored-by: Ana Tamais <aninhamoraestamais@gmail.com> * fix style after rebase * move textual inversion mixin to loaders * move mixin inheritance to DiffusionPipeline from StableDiffusionPipeline) * update dummy class name * addressed allo comments * fix old dangling import * fix style * proposal * remove bogus * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Will Berman <wlbberman@gmail.com> * finish * make style * up * fix code quality * fix code quality - again * fix code quality - 3 * fix alt diffusion code quality * fix model editing pipeline * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Finish --------- Co-authored-by: Evan Jones <evan.a.jones3@gmail.com> Co-authored-by: Ana Tamais <aninhamoraestamais@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Will Berman <wlbberman@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-03-30 18:08:39 +01:00
Sayak Paul	13845462db	[Tests] Adds a test to check if `image_embeds` None case is handled properly in `StableUnCLIPImg2ImgPipeline` (#2861 ) * improve stable unclip doc. * add: test to check if image_emebds None case is handled. * apply formatting/	2023-03-28 17:14:08 +01:00
Kashif Rasul	c0afca2d12	updated onnx pndm test (#2811 )	2023-03-28 13:43:24 +01:00
Patrick von Platen	42d950174f	[Init] Make sure shape mismatches are caught early (#2847 ) Improve init	2023-03-28 09:08:28 +01:00
Pedro Cuenca	81125d8499	Make dynamo wrapped modules work with save_pretrained (#2726 ) * Workaround for saving dynamo-wrapped models. * Accept suggestion from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Apply workaround when overriding pipeline components. * Ensure the correct config.json is saved to disk. Instead of the dynamo class. * Save correct module (not compiled one) * Add test * style * fix docstrings * Go back to using string comparisons. PyTorch CPU does not have _dynamo. * Simple test for save_pretrained of compiled models. * Helper function to test whether module is compiled. --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-03-28 09:03:21 +02:00
Pedro Cuenca	b10f527577	Helper function to disable custom attention processors (#2791 ) * Helper function to disable custom attention processors. * Restore code deleted by mistake. * Format * Fix modeling_text_unet copy.	2023-03-27 20:31:19 +02:00
Patrick von Platen	4c26cb9cc8	[Tests] Fix slow tests (#2846 )	2023-03-27 18:45:49 +01:00
Kashif Rasul	f6feb69991	Relax DiT test (#2808 ) * Relax DiT test * relax 2 more tests * fix style * skip test on mac due to older protobuf	2023-03-24 11:28:55 +01:00
Bahjat Kawar	37a44bb283	Add ModelEditing pipeline (#2721 ) * TIME first commit * styling. * styling 2. * fixes; tests * apply styling and doc fix. * remove sups. * fixes * remove temp file * move augmentations to const * added doc entry * code quality * customize augmentations * quality * quality --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2023-03-24 13:01:39 +05:30
Sanchit Gandhi	b94880e536	Add AudioLDM (#2232 ) * Add AudioLDM * up * add vocoder * start unet * unconditional unet * clap, vocoder and vae * clean-up: conversion scripts * fix: conversion script token_type_ids * clean-up: pipeline docstring * tests: from SD * clean-up: cpu offload vocoder instead of safety checker * feat: adapt tests to audioldm * feat: add docs * clean-up: amend pipeline docstrings * clean-up: make style * clean-up: make fix-copies * fix: add doc path to toctree * clean-up: args for conversion script * clean-up: paths to checkpoints * fix: use conditional unet * clean-up: make style * fix: type hints for UNet * clean-up: docstring for UNet * clean-up: make style * clean-up: remove duplicate in docstring * clean-up: make style * clean-up: make fix-copies * clean-up: move imports to start in code snippet * fix: pass cross_attention_dim as a list/tuple to unet * clean-up: make fix-copies * fix: update checkpoint path * fix: unet cross_attention_dim in tests * film embeddings -> class embeddings * Apply suggestions from code review Co-authored-by: Will Berman <wlbberman@gmail.com> * fix: unet film embed to use existing args * fix: unet tests to use existing args * fix: make style * fix: transformers import and version in init * clean-up: make style * Revert "clean-up: make style" This reverts commit `5d6d1f8b32`. * clean-up: make style * clean-up: use pipeline tester mixin tests where poss * clean-up: skip attn slicing test * fix: add torch dtype to docs * fix: remove conversion script out of src * fix: remove .detach from 1d waveform * fix: reduce default num inf steps * fix: swap height/width -> audio_length_in_s * clean-up: make style * fix: remove nightly tests * fix: imports in conversion script * clean-up: slim-down to two slow tests * clean-up: slim-down fast tests * fix: batch consistent tests * clean-up: make style * clean-up: remove vae slicing fast test * clean-up: propagate changes to doc * fix: increase test tol to 1e-2 * clean-up: finish docs * clean-up: make style * feat: vocoder / VAE compatibility check * feat: possibly expand / cut audio waveform * fix: pipeline call signature test * fix: slow tests output len * clean-up: make style * make style --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: William Berman <WLBberman@gmail.com>	2023-03-23 19:00:21 +01:00
YiYi Xu	df91c44712	Flax controlnet (#2727 ) * add contronet flax --------- Co-authored-by: yiyixuxu <yixu310@gmail,com>	2023-03-23 05:46:23 -10:00
Pedro Cuenca	aa0531fa8d	Skip `mps` in text-to-video tests (#2792 ) * Skip mps in text-to-video tests. * style * Skip UNet3D mps tests.	2023-03-23 14:39:03 +01:00
Kashif Rasul	2ef9bdd76f	Music Spectrogram diffusion pipeline (#1044 ) * initial TokenEncoder and ContinuousEncoder * initial modules * added ContinuousContextTransformer * fix copy paste error * use numpy for get_sequence_length * initial terminal relative positional encodings * fix weights keys * fix assert * cross attend style: concat encodings * make style * concat once * fix formatting * Initial SpectrogramPipeline * fix input_tokens * make style * added mel output * ignore weights for config * move mel to numpy * import pipeline * fix class names and import * moved models to models folder * import ContinuousContextTransformer and SpectrogramDiffusionPipeline * initial spec diffusion converstion script * renamed config to t5config * added weight loading * use arguments instead of t5config * broadcast noise time to batch dim * fix call * added scale_to_features * fix weights * transpose laynorm weight * scale is a vector * scale the query outputs * added comment * undo scaling * undo depth_scaling * inital get_extended_attention_mask * attention_mask is none in self-attention * cleanup * manually invert attention * nn.linear need bias=False * added T5LayerFFCond * remove to fix conflict * make style and dummy * remove unsed variables * remove predict_epsilon * Move accelerate to a soft-dependency (#1134) * finish * finish * Update src/diffusers/modeling_utils.py * Update src/diffusers/pipeline_utils.py Co-authored-by: Anton Lozhkov <anton@huggingface.co> * more fixes * fix Co-authored-by: Anton Lozhkov <anton@huggingface.co> * fix order * added initial midi to note token data pipeline * added int to int tokenizer * remove duplicate * added logic for segments * add melgan to pipeline * move autoregressive gen into pipeline * added note_representation_processor_chain * fix dtypes * remove immutabledict req * initial doc * use np.where * require note_seq * fix typo * update dependency * added note-seq to test * added is_note_seq_available * fix import * added toc * added example usage * undo for now * moved docs * fix merge * fix imports * predict first segment * avoid un-needed copy to and from cpu * make style * Copyright * fix style * add test and fix inference steps * remove bogus files * reorder models * up * remove transformers dependency * make work with diffusers cross attention * clean more * remove @ * improve further * up * uP * Apply suggestions from code review * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py * loop over all tokens * make style * Added a section on the model * fix formatting * grammer * formatting * make fix-copies * Update src/diffusers/pipelines/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/diffusers/pipelines/spectrogram_diffusion/pipeline_spectrogram_diffusion.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * added callback ad optional ionnx * do not squeeze batch dim * clean up more * upload * convert jax to nnumpy * make style * fix warning * make fix-copies * fix warning * add initial fast tests * add initial pipeline_params * eval mode due to dropout * skip batch tests as pipeline runs on a single file * make style * fix relative path * fix doc tests * Update src/diffusers/models/t5_film_transformer.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/diffusers/models/t5_film_transformer.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/api/pipelines/spectrogram_diffusion.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add MidiProcessor * format * fix org * Apply suggestions from code review * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py * make style * pin protobuf to <4 * fix formatting * white space * tensorboard needs protobuf --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Anton Lozhkov <anton@huggingface.co>	2023-03-23 14:06:17 +01:00
Naoki Ainoya	14e3a28c12	Rename 'CLIPFeatureExtractor' class to 'CLIPImageProcessor' (#2732 ) The 'CLIPFeatureExtractor' class name has been renamed to 'CLIPImageProcessor' in order to comply with future deprecation. This commit includes the necessary changes to the affected files.	2023-03-23 13:49:22 +01:00
Pedro Cuenca	92e1164e2e	`mps`: remove warmup passes (#2771 ) * Remove warmup passes in mps tests. * Update mps docs: no warmup pass in PyTorch 2 * Update imports.	2023-03-22 19:29:27 +01:00
Patrick von Platen	ca1a22296d	[MS Text To Video] Add first text to video (#2738 ) * [MS Text To Video} Add first text to video * upload * make first model example * match unet3d params * make sure weights are correcctly converted * improve * forward pass works, but diff result * make forward work * fix more * finish * refactor video output class. * feat: add support for a video export utility. * fix: opencv availability check. * run make fix-copies. * add: docs for the model components. * add: standalone pipeline doc. * edit docstring of the pipeline. * add: right path to TransformerTempModel * add: first set of tests. * complete fast tests for text to video. * fix bug * up * three fast tests failing. * add: note on slow tests * make work with all schedulers * apply styling. * add slow tests * change file name * update * more correction * more fixes * finish * up * Apply suggestions from code review * up * finish * make copies * fix pipeline tests * fix more tests * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * apply suggestions * up * revert --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-03-22 18:39:33 +01:00
Will Berman	ca1e40726e	stable diffusion depth batching fix (#2757 )	2023-03-21 10:18:44 -07:00
1lint	b33bd91fae	Add option to set dtype in pipeline.to() method (#2317 ) add test_to_dtype to check pipe.to(fp16)	2023-03-21 15:21:23 +01:00
Pedro Cuenca	1fcf279d74	Fix mps tests on torch 2.0 (#2766 )	2023-03-21 15:19:31 +01:00
Alexander Pivovarov	f024e00398	Fix typos (#2715 ) Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-03-21 13:45:04 +01:00
Patrick von Platen	fdcff560d0	Fix more slow tests	2023-03-18 19:41:38 +00:00
Patrick von Platen	9ecd924859	[Tests] Correct PT2 (#2724 ) * [Tests] Correct PT2 * correct more * move versatile to nightly * up * up * again * Apply suggestions from code review	2023-03-18 18:38:04 +01:00
Andy	116f70cbf8	Enabling gradient checkpointing for VAE (#2536 ) * updated black format * update black format * make style format * updated line endings * update code formatting * Update examples/research_projects/onnxruntime/text_to_image/train_text_to_image.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/diffusers/models/vae.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/diffusers/models/vae.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * added vae gradient checkpointing test * make style --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Will Berman <wlbberman@gmail.com>	2023-03-17 14:59:38 -07:00
Nicolas Patry	d9227cf788	Adding `use_safetensors` argument to give more control to users (#2123 ) * Adding `use_safetensors` argument to give more control to users about which weights they use. * Doc style. * Rebased (not functional). * Rebased and functional with tests. * Style. * Apply suggestions from code review * Style. * Addressing comments. * Update tests/test_pipelines.py Co-authored-by: Will Berman <wlbberman@gmail.com> * Black ??? --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Will Berman <wlbberman@gmail.com>	2023-03-16 15:57:43 +01:00
Patrick von Platen	e828232780	Rename attention (#2691 ) * rename file * rename attention * fix more * rename more * up * more deprecation imports * fixes	2023-03-16 00:35:54 +01:00
YiYi Xu	e52cd55615	Add image_processor (#2617 ) * add image_processor --------- Co-authored-by: yiyixuxu <yixu310@gmail,com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2023-03-15 07:55:49 -10:00
Kashif Rasul	cf4227cd1e	T5Attention support for cross-attention (#2654 ) * fix AttnProcessor2_0 Fix use of AttnProcessor2_0 for cross attention with mask * added scale_qk and out_bias flags * fixed for xformers * check if it has scale argument * Update cross_attention.py * check torch version * fix sliced attn * style * set scale * fix test * fixed addedKV processor * revert back AttnProcessor2_0 * if missing if * fix inner_dim --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-03-15 18:04:05 +01:00
Sayak Paul	4553c29d92	[Tests] fix: slow serialization test (#2678 ) fix: slow serialization tests	2023-03-15 22:30:21 +05:30
Will Berman	279f744ce5	controlnet integration tests num_inference_steps=3 (#2672 )	2023-03-14 14:42:32 -07:00
clarencechen	ee71d9d03d	Add support for different model prediction types in DDIMInverseScheduler (#2619 ) * Add support for different model prediction types in DDIMInverseScheduler Resolve alpha_prod_t_prev index issue for final step of inversion * Fix old bug introduced when prediction type is "sample" * Add support for sample clipping for numerical stability and deprecate old kwarg * Detach sample, alphas, betas Derive predicted noise from model output before dist. regularization Style cleanup * Log loss for debugging * Revert "Log loss for debugging" This reverts commit `76ea9c856f`. * Add comments * Add inversion equivalence test * Add expected data for Pix2PixZero pipeline tests with SD 2 * Update tests/pipelines/stable_diffusion/test_stable_diffusion_pix2pix_zero.py * Remove cruft and add more explanatory comments --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-03-14 21:25:12 +01:00
Ilmari Heikkinen	a7cc468fdb	AutoencoderKL: clamp indices of blend_h and blend_v to input size (#2660 )	2023-03-14 17:06:51 +01:00

521 Commits